1. 18 Jan, 2012 2 commits
    • audit: inline audit_syscall_entry to reduce burden on archs · b05d8447
      Committed by Eric Paris
      Every arch calls:
      
      if (unlikely(current->audit_context))
      	audit_syscall_entry()
      
      which requires knowledge about audit (the existence of audit_context) in
      the arch code.  Just do it all in a static inline in audit.h so that arches
      can remain blissfully ignorant.
      Signed-off-by: Eric Paris <eparis@redhat.com>
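The pattern this commit describes can be sketched in plain C as follows. This is a minimal user-space illustration, not the kernel's audit.h: `struct task`, `current_task`, and `audit_syscall_entry_slow` are hypothetical stand-ins, and `__builtin_expect` (GCC/Clang) plays the role of `unlikely()`.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct audit_context { int entries; };
struct task { struct audit_context *audit_context; };

/* Stand-in for the kernel's `current` task pointer. */
static struct task *current_task;

/* Out-of-line slow path: only reached when the task is being audited. */
static void audit_syscall_entry_slow(int arch, int major)
{
    (void)arch;
    (void)major;
    current_task->audit_context->entries++;
}

/* Inline wrapper: arch code calls this unconditionally and never
 * looks at audit_context itself. */
static inline void audit_syscall_entry(int arch, int major)
{
    if (__builtin_expect(current_task->audit_context != NULL, 0))
        audit_syscall_entry_slow(arch, major);
}
```

With this shape the check lives in one header, so an arch's syscall entry path only needs the single call.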
    • Audit: push audit success and retcode into arch ptrace.h · d7e7528b
      Committed by Eric Paris
      The audit system previously expected arches calling audit_syscall_exit to
      supply as arguments whether the syscall succeeded and what the return code was.
      Audit also provides a helper, AUDITSC_RESULT, which was supposed to simplify things
      by converting negative retcodes to an audit-internal magic value stating
      success or failure.  This helper was wrong and could indicate that a valid
      pointer returned to userspace was a failed syscall.  The fix is to fix the
      layering foolishness.  We now pass audit_syscall_exit a struct pt_regs and it
      in turn calls back into arch code to collect the return value and to
      determine if the syscall was a success or failure.  We also define a generic
      is_syscall_success() macro which determines success/failure based on whether the
      value is < -MAX_ERRNO.  This works for arches like x86 which do not use a
      separate mechanism to indicate syscall failure.
      
      We make both is_syscall_success() and regs_return_value() static inlines
      instead of macros.  The reason is that the audit function must take a void*
      for the regs.  (uml calls theirs struct uml_pt_regs instead of just struct
      pt_regs so audit_syscall_exit can't take a struct pt_regs).  Since the audit
      function takes a void* we need to use static inlines to cast it back to the
      arch-correct structure to dereference it.
      
      The other major change is that on some arches, like ia64, MIPS and ppc, we
      change regs_return_value() to give us the negative value on syscall failure.
      The only other user of this macro, kretprobe_example.c, won't notice and it
      makes the value signed consistently for the audit functions across all archs.
      
      In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
      audit code as the return value.  But the ptrace_64.h code defined the macro
      regs_return_value() as regs[3].  I have no idea which one is correct, but this
      patch now uses the regs_return_value() function, so it now uses regs[3].
      
      For powerpc we previously used regs->result but now use the
      regs_return_value() function which uses regs->gprs[3].  regs->gprs[3] is
      always positive, so regs_return_value(), much like on ia64, makes it negative
      before calling the audit code when appropriate.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
      Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
      Acked-by: Richard Weinberger <richard@nod.at> [for uml]
      Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
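The generic success test and the void* casting described in this commit might look roughly like the sketch below. The one-field `struct pt_regs` is a hypothetical layout chosen for illustration, and the comparison is done on the unsigned value, which is an assumption about how "< -MAX_ERRNO" is meant to be evaluated.

```c
#include <stdbool.h>

#define MAX_ERRNO 4095

/* Hypothetical one-register layout; real arches define their own. */
struct pt_regs { unsigned long ax; };

/* Takes void* so a caller with an arch-specific struct name can pass it;
 * the cast back to the arch structure happens here. */
static inline long regs_return_value(void *regs)
{
    return (long)((struct pt_regs *)regs)->ax;
}

/* Failure means the value falls in the top MAX_ERRNO values of the
 * unsigned range, i.e. -MAX_ERRNO..-1 when read as signed. */
static inline bool is_syscall_success(void *regs)
{
    unsigned long v = (unsigned long)regs_return_value(regs);
    return v < (unsigned long)-MAX_ERRNO;
}
```

Under this test a large "negative-looking" value below -MAX_ERRNO, such as a high pointer, still counts as success, which is the case the old helper reportedly got wrong.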
  2. 01 Dec, 2011 2 commits
    • [S390] remove reset of system call restart on psw changes · cfc9066b
      Committed by Martin Schwidefsky
      git commit 20b40a79 "signal race with restarting system calls"
      added code to poke_user/poke_user_compat to reset the system call
      restart information in the thread-info if the PSW address is changed.
      The purpose of that change was to work around old gdbs that do
      not know about REGSET_SYSTEM_CALL. It turned out that this is not
      a good idea: it makes the behaviour of the debuggee dependent on the
      order of specific ptrace calls, e.g. the REGSET_SYSTEM_CALL register
      set needs to be written last. And the workaround does not really fix
      old gdbs; inferior calls on interrupted restarting system calls do not
      work either way.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] add missing .set function for NT_S390_LAST_BREAK regset · b934069c
      Committed by Martin Schwidefsky
      The last breaking event address is a read-only value, so the regset lacks a
      .set function. If a PTRACE_SETREGSET is done for NT_S390_LAST_BREAK we
      get an oops due to a branch to zero:
      
      Kernel BUG at 0000000000000002 verbose debug info unavailable
      illegal operation: 0001 #1 SMP
      ...
      Call Trace:
      (<0000000000158294> ptrace_regset+0x184/0x188)
       <00000000001595b6> ptrace_request+0x37a/0x4fc
       <0000000000109a78> arch_ptrace+0x108/0x1fc
       <00000000001590d6> SyS_ptrace+0xaa/0x12c
       <00000000005c7a42> sysc_noemu+0x16/0x1c
       <000003fffd5ec10c> 0x3fffd5ec10c
      Last Breaking-Event-Address:
       <0000000000158242> ptrace_regset+0x132/0x188
      
      Add a nop .set function to prevent the branch to zero.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: stable@kernel.org
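The failure mode and the fix can be illustrated with a small function-pointer table. This is a simplified sketch: `struct regset` and `do_setregset` here are hypothetical and much smaller than the kernel's regset interface, and the no-op returning 0 is an assumption.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced regset descriptor. */
struct regset {
    int (*get)(unsigned long *out);
    int (*set)(unsigned long val);
};

static unsigned long last_break = 0x158242UL;

static int last_break_get(unsigned long *out)
{
    *out = last_break;
    return 0;
}

/* No-op setter: the value is read-only, so writes are silently ignored. */
static int last_break_set(unsigned long val)
{
    (void)val;
    return 0;
}

static const struct regset last_break_regset = {
    last_break_get,
    last_break_set, /* leaving this NULL is what led to the branch to zero */
};

/* Generic caller that invokes .set without checking it for NULL. */
static int do_setregset(const struct regset *rs, unsigned long val)
{
    return rs->set(val);
}
```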
  3. 30 Oct, 2011 5 commits
    • [S390] allow all addressing modes · d4e81b35
      Committed by Martin Schwidefsky
      The user space program can change its addressing mode between the
      24-bit, 31-bit and the 64-bit mode if the kernel is 64 bit. Currently
      the kernel always forces the standard amode on signal delivery and
      signal return and on ptrace: 64-bit for a 64-bit process, 31-bit for
      a compat process and 31-bit kernels. Change the signal and ptrace code
      to allow the full range of addressing modes. Signal handlers are
      run in the standard addressing mode for the process.
      
      One caveat is that even a 31-bit compat process can switch to the
      64-bit mode. The next signal will switch back into the 31-bit mode
      and there is no room in the 31-bit compat signal frame to store the
      information that the program came from the 64-bit mode.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] cleanup psw related bits and pieces · b50511e4
      Committed by Martin Schwidefsky
      Split out addressing mode bits from PSW_BASE_BITS, rename PSW_BASE_BITS
      to PSW_MASK_BASE, get rid of psw_user32_bits, remove unused function
      enabled_wait(), introduce PSW_MASK_USER, and drop PSW_MASK_MERGE macros.
      Change psw_kernel_bits / psw_user_bits to contain only the bits that
      are always set in the respective mode.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] add TIF_SYSCALL thread flag · b6ef5bb3
      Committed by Martin Schwidefsky
      Add an explicit TIF_SYSCALL bit that indicates if a task is inside
      a system call. The svc_code in the pt_regs structure is now only
      valid if TIF_SYSCALL is set. With this definition TIF_RESTART_SVC
      can be replaced with TIF_SYSCALL. Overall do_signal is a bit more
      readable and it saves a few lines of code.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] signal race with restarting system calls · 20b40a79
      Committed by Martin Schwidefsky
      For an ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system call
      do_signal will prepare the restart of the system call with a rewind of
      the PSW before calling get_signal_to_deliver (where the debugger might
      take control). For an ERESTART_RESTARTBLOCK restarting system call
      do_signal will set -EINTR as return code.
      There are two issues with this approach:
      1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
         ERESTART_RESTARTBLOCK as the rewinding already took place or the
         return code has been changed to -EINTR
      2) if get_signal_to_deliver does not return with a signal to deliver
         the restart via the repeat of the svc instruction is left in place.
         This opens a race if another signal is made pending before the
         system call instruction can be reexecuted. The original system call
         will be restarted even if the second signal would have ended the
         system call with -EINTR.
      
      These two issues can be solved by dropping the early rewind of the
      system call before get_signal_to_deliver has been called and by using
      the TIF_RESTART_SVC magic to do the restart if no signal has to be
      delivered. The only situation where the system call restart via the
      repeat of the svc instruction is appropriate is when a SA_RESTART
      signal is delivered to user space.
      
      Unfortunately this breaks inferior calls by the debugger again. The
      system call number and the length of the system call instruction are
      lost over the inferior call and user space will see ERESTARTNOHAND/
      ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this, a
      new ptrace interface is added to save/restore the system call number
      and system call instruction length.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
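One way to read the restart policy this commit arrives at is as a small decision function, evaluated only after get_signal_to_deliver has returned. The sketch below is an interpretation, not the kernel's do_signal: the enum and `classify_restart` are hypothetical names, and treating ERESTARTNOINTR as always rewinding when a handler runs is an assumption based on the usual meaning of that code.

```c
/* Kernel-internal restart codes, repeated so the sketch stands alone. */
#define ERESTARTSYS             512
#define ERESTARTNOINTR          513
#define ERESTARTNOHAND          514
#define ERESTART_RESTARTBLOCK   516

enum restart_action {
    NOT_RESTARTABLE,      /* return value is not a restart code */
    RETURN_EINTR,         /* handler runs, syscall fails with -EINTR */
    REWIND_SVC,           /* handler runs, then the svc is re-executed */
    RESTART_VIA_TIF_FLAG  /* no handler: restart via TIF_RESTART_SVC */
};

/* have_handler: a signal with a user handler is being delivered.
 * sa_restart:   that handler was installed with SA_RESTART. */
static enum restart_action
classify_restart(long retval, int have_handler, int sa_restart)
{
    switch (retval) {
    case -ERESTARTSYS:
    case -ERESTARTNOINTR:
    case -ERESTARTNOHAND:
    case -ERESTART_RESTARTBLOCK:
        break;
    default:
        return NOT_RESTARTABLE;
    }
    if (!have_handler)
        return RESTART_VIA_TIF_FLAG;   /* nothing rewound early, no race */
    if (retval == -ERESTARTNOINTR)
        return REWIND_SVC;
    if (retval == -ERESTARTSYS && sa_restart)
        return REWIND_SVC;
    return RETURN_EINTR;
}
```

Because the decision is made after the debugger has had its chance to intervene, a tracer such as strace would still see the original ERESTART* value at that point.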
    • [S390] user per registers vs. ptrace single stepping · a45aff52
      Committed by Martin Schwidefsky
      git commit 5e9a2692 "[S390] ptrace cleanup" introduced a regression
      for the case when both a user PER set (e.g. a storage alteration trace) and
      PTRACE_SINGLESTEP are active. The new code will overrule the user PER set
      with an instruction-fetch PER set over the whole address space for ptrace
      single stepping. The inferior process will be stopped after each instruction
      with an instruction fetch event. Any other events that may have occurred
      concurrently are not reported (e.g. storage alteration event) because the
      control bits for them are not set. The solution is to merge the PER control
      bits of the user PER set with the PER_EVENT_IFETCH control bit for
      PTRACE_SINGLESTEP.
      
      Cc: stable@kernel.org
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
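The merge of control bits described above comes down to OR-ing the instruction-fetch event into whatever the user already requested, rather than replacing it. A minimal sketch, with the bit values given only as illustrative assumptions and `per_control` as a hypothetical helper name:

```c
/* Illustrative event bits; treat the exact values as assumptions. */
#define PER_EVENT_IFETCH 0x40000000UL
#define PER_EVENT_STORE  0x20000000UL

/* Compute the control bits to load for a task: keep the user's PER
 * events and add instruction fetch when single stepping is active. */
static unsigned long per_control(unsigned long user_control, int single_step)
{
    unsigned long control = user_control;

    if (single_step)
        control |= PER_EVENT_IFETCH;
    return control;
}
```

The regression corresponds to assigning `PER_EVENT_IFETCH` instead of OR-ing it in, which would drop a concurrent storage alteration event.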
  4. 05 Jan, 2011 1 commit
  5. 28 Oct, 2010 1 commit
  6. 17 May, 2010 1 commit
  7. 12 May, 2010 1 commit
  8. 17 Feb, 2010 1 commit
  9. 14 Jan, 2010 1 commit
  10. 19 Dec, 2009 1 commit
  11. 06 Oct, 2009 1 commit
  12. 23 Sep, 2009 1 commit
  13. 26 Aug, 2009 3 commits
    • tracing: Create generic syscall TRACE_EVENTs · 1c569f02
      Committed by Josh Stone
      This converts the syscall_enter/exit tracepoints into TRACE_EVENTs, so
      you can have generic ftrace events that capture all system calls with
      arguments and return values.  These generic events are also renamed to
      sys_enter/exit, so they're more closely aligned to the specific
      sys_enter_foo events.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-5-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • tracing: Move tracepoint callbacks from declaration to definition · 97419875
      Committed by Josh Stone
      It's not strictly correct for the tracepoint reg/unreg callbacks to
      occur when a client is hooking up, because the actual tracepoint may not
      be present yet.  This happens to be fine for syscall, since that's in
      the core kernel, but it would cause problems for tracepoints defined in
      a module that hasn't been loaded yet.  It also means the reg/unreg has
      to be EXPORTed for any modules to use the tracepoint (as in SystemTap).
      
      This patch removes DECLARE_TRACE_WITH_CALLBACK, and instead introduces
      DEFINE_TRACE_FN which stores the callbacks in struct tracepoint.  The
      callbacks are used now when the active state of the tracepoint changes
      in set_tracepoint & disable_tracepoint.
      
      This also introduces TRACE_EVENT_FN, so ftrace events can also provide
      registration callbacks if needed.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-4-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
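The idea of storing the callbacks in the tracepoint itself and firing them on state changes can be sketched as below. The struct is a reduced, hypothetical version of `struct tracepoint`, and `set_tracepoint_state` merges what the message calls set_tracepoint and disable_tracepoint into one helper for brevity.

```c
#include <stddef.h>

/* Reduced, hypothetical tracepoint descriptor. */
struct tracepoint {
    const char *name;
    int state;                  /* 0 = disabled, 1 = enabled */
    void (*regfunc)(void);      /* called on disabled -> enabled */
    void (*unregfunc)(void);    /* called on enabled -> disabled */
};

/* Callbacks run only when the active state actually changes, so the
 * tracepoint (and its module) is known to exist at that point. */
static void set_tracepoint_state(struct tracepoint *tp, int active)
{
    if (!tp->state && active && tp->regfunc)
        tp->regfunc();
    else if (tp->state && !active && tp->unregfunc)
        tp->unregfunc();
    tp->state = active;
}

/* Example callbacks standing in for the syscall reg/unreg functions. */
static int syscall_tracing_users;
static void syscall_regfunc(void)   { syscall_tracing_users++; }
static void syscall_unregfunc(void) { syscall_tracing_users--; }
```

Since the callbacks are reached through the descriptor, a client module needs no exported reg/unreg symbols to use the tracepoint.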
    • tracing: Rename FTRACE_SYSCALLS for tracepoints · 66700001
      Committed by Josh Stone
      s/HAVE_FTRACE_SYSCALLS/HAVE_SYSCALL_TRACEPOINTS/g
      s/TIF_SYSCALL_FTRACE/TIF_SYSCALL_TRACEPOINT/g
      
      The syscall enter/exit tracing is no longer specific to just ftrace, so
      they now have names that reflect their tie to tracepoints instead.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-2-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  14. 19 Aug, 2009 1 commit
    • [S390] ftrace: update system call tracer support · 5e9ad7df
      Committed by Ingo Molnar
      Commit fb34a08c ("tracing: Add trace events for each syscall
      entry/exit") changed the lowlevel API to ftrace syscall tracing
      but did not update s390 which started making use of it recently.
      
      This broke the s390 build, as reported by Paul Mundt.
      
      Update the callbacks with the syscall number and the syscall
      return code values. This allows per-syscall tracepoints,
      syscall argument enumeration via /debug/tracing/events/syscalls/,
      and perfcounters support and integration on s390 too.
      Reported-by: Paul Mundt <lethal@linux-sh.org>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <tip-fb34a08c@git.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 13 Jul, 2009 1 commit
  16. 12 Jun, 2009 3 commits
  17. 25 Dec, 2008 1 commit
    • [S390] remove ptrace warning on 31 bit. · 547e3cec
      Committed by Martin Schwidefsky
      A kernel compile on 31 bit gives the following warnings in ptrace.c:
      
      arch/s390/kernel/ptrace.c: In function 'peek_user':
      arch/s390/kernel/ptrace.c:207: warning: unused variable 'dummy'
      arch/s390/kernel/ptrace.c: In function 'poke_user':
      arch/s390/kernel/ptrace.c:315: warning: unused variable 'dummy'
      
      Getting rid of the dummy variables removes the warnings.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  18. 27 Nov, 2008 1 commit
    • [S390] fix system call parameter functions. · 59da2139
      Committed by Martin Schwidefsky
      syscall_get_nr() currently returns a valid result only if the call
      chain of the traced process includes do_syscall_trace_enter(). But
      collect_syscall() can be called for any sleeping task, so the result of
      syscall_get_nr() in general is completely bogus.
      
      To make syscall_get_nr() work for any sleeping task, the traps field
      in pt_regs is replaced with svcnr - the system call number the process
      is executing. If svcnr == 0 the process is not on a system call path.
      
      The syscall_get_arguments and syscall_set_arguments use regs->gprs[2]
      for the first system call parameter. This is incorrect since gprs[2]
      may have been overwritten with the system call number if the call
      chain includes do_syscall_trace_enter. Use regs->orig_gprs2 instead.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
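The two accessors this commit fixes could be sketched as follows. The `struct pt_regs` layout is a hypothetical reduction containing only the fields the message mentions; the saved-register field is written here as `orig_gpr2`, and `syscall_get_first_arg` is an illustrative helper rather than the kernel's syscall_get_arguments.

```c
/* Hypothetical reduced register frame. */
struct pt_regs {
    unsigned long gprs[16];
    unsigned long orig_gpr2;   /* first syscall argument as passed by user */
    unsigned short svcnr;      /* syscall number, 0 if not in a syscall */
};

/* Valid for any sleeping task: -1 signals "not on a system call path". */
static long syscall_get_nr(const struct pt_regs *regs)
{
    return regs->svcnr ? (long)regs->svcnr : -1L;
}

/* gprs[2] may already hold the syscall number, so read the saved copy. */
static unsigned long syscall_get_first_arg(const struct pt_regs *regs)
{
    return regs->orig_gpr2;
}
```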
  19. 11 Oct, 2008 1 commit
  20. 09 Sep, 2008 1 commit
  21. 14 Jul, 2008 1 commit
  22. 07 May, 2008 1 commit
  23. 30 Apr, 2008 2 commits
  24. 17 Apr, 2008 1 commit
  25. 26 Jan, 2008 1 commit
  26. 17 Oct, 2007 1 commit
  27. 18 Jul, 2007 2 commits
  28. 06 Feb, 2007 1 commit
    • [S390] noexec protection · c1821c2e
      Committed by Gerald Schaefer
      This provides a noexec protection on s390 hardware. Our hardware does
      not have any bits left in the pte for a hw noexec bit, so this is a
      different approach using shadow page tables and a special addressing
      mode that allows separate address spaces for code and data.
      
      As a special feature of our "secondary-space" addressing mode, separate
      page tables can be specified for the translation of data addresses
      (storage operands) and instruction addresses. The shadow page table is
      used for the instruction addresses and the standard page table for the
      data addresses.
      The shadow page table is linked to the standard page table by a pointer
      in page->lru.next of the struct page corresponding to the page that
      contains the standard page table (since page->private is not really
      private with the pte_lock and the page table pages are not in the LRU
      list).
      Depending on the software bits of a pte, it is either inserted into
      both page tables or just into the standard (data) page table. Pages of
      a vma that does not have the VM_EXEC bit set get mapped only in the
      data address space. Any attempt to execute code on such a page will cause a
      page translation exception. The standard reaction to this is a SIGSEGV
      with two exceptions: the two system call opcodes 0x0a77 (sys_sigreturn)
      and 0x0aad (sys_rt_sigreturn) are allowed. They are stored by the
      kernel to the signal stack frame. Unfortunately, the signal return
      mechanism cannot be modified to use an SA_RESTORER because the
      exception unwinding code depends on the system call opcode stored
      behind the signal stack frame.
      
      This feature requires that user space is executed in secondary-space
      mode and the kernel in home-space mode, which means that the addressing
      modes need to be switched and that the noexec protection only works
      for user space.
      After switching the addressing modes, we cannot use the mvcp/mvcs
      instructions anymore to copy between kernel and user space. A new
      mvcos instruction has been added to the z9 EC/BC hardware which allows
      copying between arbitrary address spaces, but on older hardware the
      page tables need to be walked manually.
      Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
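The two-opcode exception mentioned in this commit can be expressed as a simple predicate on the faulting instruction. This is only a sketch of that check, assuming the instruction halfword has already been fetched; `is_sigreturn_svc` and the macro names are hypothetical, and the opcodes are split into the svc opcode byte 0x0a plus the system call number.

```c
#include <stdbool.h>
#include <stdint.h>

#define SVC_OPCODE       0x0a00u  /* svc instruction, number in the low byte */
#define NR_sigreturn     0x77u    /* 119 */
#define NR_rt_sigreturn  0xadu    /* 173 */

/* True for the two svc instructions the kernel writes to the signal
 * stack frame, which may execute from a non-executable page. */
static bool is_sigreturn_svc(uint16_t insn)
{
    return insn == (SVC_OPCODE | NR_sigreturn) ||
           insn == (SVC_OPCODE | NR_rt_sigreturn);
}
```

Any other instruction fetched from a page mapped only in the data address space would take the default path, a SIGSEGV.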