1. 23 6月, 2018 1 次提交
    • W
      rseq: Avoid infinite recursion when delivering SIGSEGV · 784e0300
      Will Deacon 提交于
      When delivering a signal to a task that is using rseq, we call into
      __rseq_handle_notify_resume() so that the registers pushed in the
      sigframe are updated to reflect the state of the restartable sequence
      (for example, ensuring that the signal returns to the abort handler if
      necessary).
      
      However, if the rseq management fails due to an unrecoverable fault when
      accessing userspace or certain combinations of RSEQ_CS_* flags, then we
      will attempt to deliver a SIGSEGV. This has the potential for infinite
      recursion if the rseq code continuously fails on signal delivery.
      
      Avoid this problem by using force_sigsegv() instead of force_sig(), which
      is explicitly designed to reset the SEGV handler to SIG_DFL in the case
      of a recursive fault. In doing so, remove rseq_signal_deliver() from the
      internal rseq API and have an optional struct ksignal * parameter to
      rseq_handle_notify_resume() instead.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: peterz@infradead.org
      Cc: paulmck@linux.vnet.ibm.com
      Cc: boqun.feng@gmail.com
      Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com
      784e0300
  2. 16 6月, 2018 1 次提交
  3. 14 6月, 2018 1 次提交
    • L
      Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables · 050e9baa
      Linus Torvalds 提交于
      The changes to automatically test for working stack protector compiler
      support in the Kconfig files removed the special STACKPROTECTOR_AUTO
      option that picked the strongest stack protector that the compiler
      supported.
      
      That was all a nice cleanup - it makes no sense to have the AUTO case
      now that the Kconfig phase can just determine the compiler support
      directly.
      
      HOWEVER.
      
      It also meant that doing "make oldconfig" would now _disable_ the strong
      stackprotector if you had AUTO enabled, because in a legacy config file,
      the sane stack protector configuration would look like
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_NONE is not set
        # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_STACKPROTECTOR_AUTO=y
      
      and when you ran this through "make oldconfig" with the Kbuild changes,
      it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
      been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
      CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
      used to be disabled (because it was really enabled by AUTO), and would
      disable it in the new config, resulting in:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      That's dangerously subtle - people could suddenly find themselves with
      the weaker stack protector setup without even realizing.
      
      The solution here is to just rename not just the old RECULAR stack
      protector option, but also the strong one.  This does that by just
      removing the CC_ prefix entirely for the user choices, because it really
      is not about the compiler support (the compiler support now instead
      automatially impacts _visibility_ of the options to users).
      
      This results in "make oldconfig" actually asking the user for their
      choice, so that we don't have any silent subtle security model changes.
      The end result would generally look like this:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_STACKPROTECTOR=y
        CONFIG_STACKPROTECTOR_STRONG=y
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      where the "CC_" versions really are about internal compiler
      infrastructure, not the user selections.
      Acked-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      050e9baa
  4. 06 6月, 2018 2 次提交
    • M
      x86: Wire up restartable sequence system call · 05c17ced
      Mathieu Desnoyers 提交于
      Wire up the rseq system call on x86 32/64.
      
      This provides an ABI improving the speed of a user-space getcpu
      operation on x86 by removing the need to perform a function call, "lsl"
      instruction, or system call on the fast path, as well as improving the
      speed of user-space operations on per-cpu data.
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Chris Lameter <cl@linux.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ben Maurer <bmaurer@fb.com>
      Cc: linux-api@vger.kernel.org
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lkml.kernel.org/r/20180602124408.8430-8-mathieu.desnoyers@efficios.com
      05c17ced
    • M
      x86: Add support for restartable sequences · d6761b8f
      Mathieu Desnoyers 提交于
      Call the rseq_handle_notify_resume() function on return to userspace if
      TIF_NOTIFY_RESUME thread flag is set.
      
      Perform fixup on the pre-signal frame when a signal is delivered on top
      of a restartable sequence critical section.
      
      Check that system calls are not invoked from within rseq critical
      sections by invoking rseq_signal() from syscall_return_slowpath().
      With CONFIG_DEBUG_RSEQ, such behavior results in termination of the
      process with SIGSEGV.
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Chris Lameter <cl@linux.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ben Maurer <bmaurer@fb.com>
      Cc: linux-api@vger.kernel.org
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lkml.kernel.org/r/20180602124408.8430-7-mathieu.desnoyers@efficios.com
      d6761b8f
  5. 15 5月, 2018 3 次提交
  6. 06 5月, 2018 1 次提交
  7. 05 5月, 2018 1 次提交
  8. 03 5月, 2018 1 次提交
    • C
      aio: implement io_pgetevents · 7a074e96
      Christoph Hellwig 提交于
      This is the io_getevents equivalent of ppoll/pselect and allows to
      properly mix signals and aio completions (especially with IOCB_CMD_POLL)
      and atomically executes the following sequence:
      
      	sigset_t origmask;
      
      	pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
      	ret = io_getevents(ctx, min_nr, nr, events, timeout);
      	pthread_sigmask(SIG_SETMASK, &origmask, NULL);
      
      Note that unlike many other signal related calls we do not pass a sigmask
      size, as that would get us to 7 arguments, which aren't easily supported
      by the syscall infrastructure.  It seems a lot less painful to just add a
      new syscall variant in the unlikely case we're going to increase the
      sigset size.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      7a074e96
  9. 27 4月, 2018 1 次提交
    • A
      x86/entry/64/compat: Preserve r8-r11 in int $0x80 · 8bb2610b
      Andy Lutomirski 提交于
      32-bit user code that uses int $80 doesn't care about r8-r11.  There is,
      however, some 64-bit user code that intentionally uses int $0x80 to invoke
      32-bit system calls.  From what I've seen, basically all such code assumes
      that r8-r15 are all preserved, but the kernel clobbers r8-r11.  Since I
      doubt that there's any code that depends on int $0x80 zeroing r8-r11,
      change the kernel to preserve them.
      
      I suspect that very little user code is broken by the old clobber, since
      r8-r11 are only rarely allocated by gcc, and they're clobbered by function
      calls, so they only way we'd see a problem is if the same function that
      invokes int $0x80 also spills something important to one of these
      registers.
      
      The current behavior seems to date back to the historical commit
      "[PATCH] x86-64 merge for 2.6.4".  Before that, all regs were
      preserved.  I can't find any explanation of why this change was made.
      
      Update the test_syscall_vdso_32 testcase as well to verify the new
      behavior, and it strengthens the test to make sure that the kernel doesn't
      accidentally permute r8..r15.
      Suggested-by: NDenys Vlasenko <dvlasenk@redhat.com>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org
      8bb2610b
  10. 25 4月, 2018 1 次提交
    • E
      signal: Ensure every siginfo we send has all bits initialized · 3eb0f519
      Eric W. Biederman 提交于
      Call clear_siginfo to ensure every stack allocated siginfo is properly
      initialized before being passed to the signal sending functions.
      
      Note: It is not safe to depend on C initializers to initialize struct
      siginfo on the stack because C is allowed to skip holes when
      initializing a structure.
      
      The initialization of struct siginfo in tracehook_report_syscall_exit
      was moved from the helper user_single_step_siginfo into
      tracehook_report_syscall_exit itself, to make it clear that the local
      variable siginfo gets fully initialized.
      
      In a few cases the scope of struct siginfo has been reduced to make it
      clear that siginfo siginfo is not used on other paths in the function
      in which it is declared.
      
      Instances of using memset to initialize siginfo have been replaced
      with calls clear_siginfo for clarity.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      3eb0f519
  11. 10 4月, 2018 1 次提交
  12. 09 4月, 2018 3 次提交
    • D
      syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*() · d5a00528
      Dominik Brodowski 提交于
      This rename allows us to have a coherent syscall stub naming convention on
      64-bit x86 (0xffffffff prefix removed):
      
       810f0af0 t            kernel_waitid	# common (32/64) kernel helper
      
       <inline>            __do_sys_waitid	# inlined helper doing actual work
       810f0be0 t          __se_sys_waitid	# C func calling inlined helper
      
       <inline>     __do_compat_sys_waitid	# inlined helper doing actual work
       810f0d80 t   __se_compat_sys_waitid	# compat C func calling inlined helper
      
       810f2080 T         __x64_sys_waitid	# x64 64-bit-ptregs -> C stub
       810f20b0 T        __ia32_sys_waitid	# ia32 32-bit-ptregs -> C stub[*]
       810f2470 T __ia32_compat_sys_waitid	# ia32 32-bit-ptregs -> compat C stub
       810f2490 T  __x32_compat_sys_waitid	# x32 64-bit-ptregs -> compat C stub
      
          [*] This stub is unused, as the syscall table links
      	__ia32_compat_sys_waitid instead of __ia32_sys_waitid as we need
      	a compat variant here.
      Suggested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180409105145.5364-4-linux@dominikbrodowski.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d5a00528
    • D
      syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention · 5ac9efa3
      Dominik Brodowski 提交于
      Tidy the naming convention for compat syscall subs. Hints which describe
      the purpose of the stub go in front and receive a double underscore to
      denote that they are generated on-the-fly by the COMPAT_SYSCALL_DEFINEx()
      macro.
      
      For the generic case, this means:
      
      t            kernel_waitid	# common C function (see kernel/exit.c)
      
          __do_compat_sys_waitid	# inlined helper doing the actual work
      				# (takes original parameters as declared)
      
      T   __se_compat_sys_waitid	# sign-extending C function calling inlined
      				# helper (takes parameters of type long,
      				# casts them to unsigned long and then to
      				# the declared type)
      
      T        compat_sys_waitid      # alias to __se_compat_sys_waitid()
      				# (taking parameters as declared), to
      				# be included in syscall table
      
      For x86, the naming is as follows:
      
      t            kernel_waitid	# common C function (see kernel/exit.c)
      
          __do_compat_sys_waitid	# inlined helper doing the actual work
      				# (takes original parameters as declared)
      
      t   __se_compat_sys_waitid      # sign-extending C function calling inlined
      				# helper (takes parameters of type long,
      				# casts them to unsigned long and then to
      				# the declared type)
      
      T __ia32_compat_sys_waitid	# IA32_EMULATION 32-bit-ptregs -> C stub,
      				# calls __se_compat_sys_waitid(); to be
      				# included in syscall table
      
      T  __x32_compat_sys_waitid	# x32 64-bit-ptregs -> C stub, calls
      				# __se_compat_sys_waitid(); to be included
      				# in syscall table
      
      If only one of IA32_EMULATION and x32 is enabled, __se_compat_sys_waitid()
      may be inlined into the stub __{ia32,x32}_compat_sys_waitid().
      Suggested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180409105145.5364-3-linux@dominikbrodowski.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5ac9efa3
    • D
      syscalls/core, syscalls/x86: Clean up syscall stub naming convention · e145242e
      Dominik Brodowski 提交于
      Tidy the naming convention for compat syscall subs. Hints which describe
      the purpose of the stub go in front and receive a double underscore to
      denote that they are generated on-the-fly by the SYSCALL_DEFINEx() macro.
      
      For the generic case, this means (0xffffffff prefix removed):
      
       810f08d0 t     kernel_waitid	# common C function (see kernel/exit.c)
      
       <inline>     __do_sys_waitid	# inlined helper doing the actual work
      				# (takes original parameters as declared)
      
       810f1aa0 T   __se_sys_waitid	# sign-extending C function calling inlined
      				# helper (takes parameters of type long;
      				# casts them to the declared type)
      
       810f1aa0 T        sys_waitid	# alias to __se_sys_waitid() (taking
      				# parameters as declared), to be included
      				# in syscall table
      
      For x86, the naming is as follows:
      
       810efc70 t     kernel_waitid	# common C function (see kernel/exit.c)
      
       <inline>     __do_sys_waitid	# inlined helper doing the actual work
      				# (takes original parameters as declared)
      
       810efd60 t   __se_sys_waitid	# sign-extending C function calling inlined
      				# helper (takes parameters of type long;
      				# casts them to the declared type)
      
       810f1140 T __ia32_sys_waitid	# IA32_EMULATION 32-bit-ptregs -> C stub,
      				# calls __se_sys_waitid(); to be included
      				# in syscall table
      
       810f1110 T        sys_waitid	# x86 64-bit-ptregs -> C stub, calls
      				# __se_sys_waitid(); to be included in
      				# syscall table
      
      For x86, sys_waitid() will be re-named to __x64_sys_waitid in a follow-up
      patch.
      Suggested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180409105145.5364-2-linux@dominikbrodowski.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e145242e
  13. 07 4月, 2018 1 次提交
    • M
      kbuild: mark $(targets) as .SECONDARY and remove .PRECIOUS markers · 54a702f7
      Masahiro Yamada 提交于
      GNU Make automatically deletes intermediate files that are updated
      in a chain of pattern rules.
      
      Example 1) %.dtb.o <- %.dtb.S <- %.dtb <- %.dts
      Example 2) %.o <- %.c <- %.c_shipped
      
      A couple of makefiles mark such targets as .PRECIOUS to prevent Make
      from deleting them, but the correct way is to use .SECONDARY.
      
        .SECONDARY
          Prerequisites of this special target are treated as intermediate
          files but are never automatically deleted.
      
        .PRECIOUS
          When make is interrupted during execution, it may delete the target
          file it is updating if the file was modified since make started.
          If you mark the file as precious, make will never delete the file
          if interrupted.
      
      Both can avoid deletion of intermediate files, but the difference is
      the behavior when Make is interrupted; .SECONDARY deletes the target,
      but .PRECIOUS does not.
      
      The use of .PRECIOUS is relatively rare since we do not want to keep
      partially constructed (possibly corrupted) targets.
      
      Another difference is that .PRECIOUS works with pattern rules whereas
      .SECONDARY does not.
      
        .PRECIOUS: $(obj)/%.lex.c
      
      works, but
      
        .SECONDARY: $(obj)/%.lex.c
      
      has no effect.  However, for the reason above, I do not want to use
      .PRECIOUS which could cause obscure build breakage.
      
      The targets specified as .SECONDARY must be explicit.  $(targets)
      contains all targets that need to include .*.cmd files.  So, the
      intermediates you want to keep are mostly in there.  Therefore, mark
      $(targets) as .SECONDARY.  It means primary targets are also marked
      as .SECONDARY, but I do not see any drawback for this.
      
      I replaced some .SECONDARY / .PRECIOUS markers with 'targets'.  This
      will make Kbuild search for non-existing .*.cmd files, but this is
      not a noticeable performance issue.
      Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NFrank Rowand <frowand.list@gmail.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      54a702f7
  14. 05 4月, 2018 5 次提交
    • D
      syscalls/x86: Extend register clearing on syscall entry to lower registers · 6dc936f1
      Dominik Brodowski 提交于
      To reduce the chance that random user space content leaks down the call
      chain in registers, also clear lower registers on syscall entry:
      
      For 64-bit syscalls, extend the register clearing in PUSH_AND_CLEAR_REGS
      to %dx and %cx. This should not hurt at all, also on the other callers
      of that macro. We do not need to clear %rdi and %rsi for syscall entry,
      as those registers are used to pass the parameters to do_syscall_64().
      
      For the 32-bit compat syscalls, do_int80_syscall_32() and
      do_fast_syscall_32() each only take one parameter. Therefore, extend the
      register clearing to %dx, %cx, and %si in entry_SYSCALL_compat and
      entry_INT80_compat.
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180405095307.3730-8-linux@dominikbrodowski.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6dc936f1
    • D
      syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64 · f8781c4a
      Dominik Brodowski 提交于
      Removing CONFIG_SYSCALL_PTREGS from arch/x86/Kconfig and simply selecting
      ARCH_HAS_SYSCALL_WRAPPER unconditionally on x86-64 allows us to simplify
      several codepaths.
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180405095307.3730-7-linux@dominikbrodowski.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f8781c4a
    • D
      syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32 · ebeb8c82
      Dominik Brodowski 提交于
      Extend ARCH_HAS_SYSCALL_WRAPPER for i386 emulation and for x32 on 64-bit
      x86.
      
      For x32, all we need to do is to create an additional stub for each
      compat syscall which decodes the parameters in x86-64 ordering, e.g.:
      
      	asmlinkage long __compat_sys_x32_xyzzy(struct pt_regs *regs)
      	{
      		return c_SyS_xyzzy(regs->di, regs->si, regs->dx);
      	}
      
      For i386 emulation, we need to teach compat_sys_*() to take struct
      pt_regs as its only argument, e.g.:
      
      	asmlinkage long __compat_sys_ia32_xyzzy(struct pt_regs *regs)
      	{
      		return c_SyS_xyzzy(regs->bx, regs->cx, regs->dx);
      	}
      
      In addition, we need to create additional stubs for common syscalls
      (that is, for syscalls which have the same parameters on 32-bit and
      64-bit), e.g.:
      
      	asmlinkage long __sys_ia32_xyzzy(struct pt_regs *regs)
      	{
      		return c_sys_xyzzy(regs->bx, regs->cx, regs->dx);
      	}
      
      This approach avoids leaking random user-provided register content down
      the call chain.
      
      This patch is based on an original proof-of-concept
      
       | From: Linus Torvalds <torvalds@linux-foundation.org>
       | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      and was split up and heavily modified by me, in particular to base it on
      ARCH_HAS_SYSCALL_WRAPPER.
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180405095307.3730-6-linux@dominikbrodowski.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ebeb8c82
    • D
      syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls · fa697140
      Dominik Brodowski 提交于
      Let's make use of ARCH_HAS_SYSCALL_WRAPPER=y on pure 64-bit x86-64 systems:
      
      Each syscall defines a stub which takes struct pt_regs as its only
      argument. It decodes just those parameters it needs, e.g:
      
      	asmlinkage long sys_xyzzy(const struct pt_regs *regs)
      	{
      		return SyS_xyzzy(regs->di, regs->si, regs->dx);
      	}
      
      This approach avoids leaking random user-provided register content down
      the call chain.
      
      For example, for sys_recv() which is a 4-parameter syscall, the assembly
      now is (in slightly reordered fashion):
      
      	<sys_recv>:
      		callq	<__fentry__>
      
      		/* decode regs->di, ->si, ->dx and ->r10 */
      		mov	0x70(%rdi),%rdi
      		mov	0x68(%rdi),%rsi
      		mov	0x60(%rdi),%rdx
      		mov	0x38(%rdi),%rcx
      
      		[ SyS_recv() is automatically inlined by the compiler,
      		  as it is not [yet] used anywhere else ]
      		/* clear %r9 and %r8, the 5th and 6th args */
      		xor	%r9d,%r9d
      		xor	%r8d,%r8d
      
      		/* do the actual work */
      		callq	__sys_recvfrom
      
      		/* cleanup and return */
      		cltq
      		retq
      
      The only valid place in an x86-64 kernel which rightfully calls
      a syscall function on its own -- vsyscall -- needs to be modified
      to pass struct pt_regs onwards as well.
      
      To keep the syscall table generation working independent of
      SYSCALL_PTREGS being enabled, the stubs are named the same as the
      "original" syscall stubs, i.e. sys_*().
      
      This patch is based on an original proof-of-concept
      
       | From: Linus Torvalds <torvalds@linux-foundation.org>
       | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      and was split up and heavily modified by me, in particular to base it on
      ARCH_HAS_SYSCALL_WRAPPER, to limit it to 64-bit-only for the time being,
      and to update the vsyscall to the new calling convention.
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180405095307.3730-4-linux@dominikbrodowski.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fa697140
    • L
      x86/syscalls: Don't pointlessly reload the system call number · dfe64506
      Linus Torvalds 提交于
      We have it in a register in the low-level asm, just pass it in as an
      argument rather than have do_syscall_64() load it back in from the
      ptregs pointer.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180405095307.3730-2-linux@dominikbrodowski.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      dfe64506
  15. 03 4月, 2018 2 次提交
  16. 24 3月, 2018 1 次提交
  17. 20 3月, 2018 2 次提交
  18. 08 3月, 2018 1 次提交
  19. 07 3月, 2018 5 次提交
  20. 28 2月, 2018 1 次提交
  21. 21 2月, 2018 5 次提交