1. 21 10月, 2008 1 次提交
  2. 14 10月, 2008 1 次提交
    • S
      ftrace: x86 mcount stub · 0a37605c
      Steven Rostedt 提交于
      x86 now sets up the mcount locations through the build and no longer
      needs to record the ip when the function is executed. This patch changes
      the initial mcount to simply return. There's no need to do any other work.
      If the ftrace start up test fails, the original mcount will be what everything
      will use, so having this as fast as possible is a good thing.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0a37605c
  3. 13 10月, 2008 4 次提交
  4. 24 7月, 2008 4 次提交
    • A
      x86, 64-bit, dwarf2: push pushes 8 bytes and popf pops 8 · e0a5a5d9
      Alexander van Heukelum 提交于
      The CFI_ADJUST_CFA_OFFSET dwarf2 annotation of a push/popf
      pair in ret_from_fork wrongly used a value of 4. It should
      have been 8. Fix that.
      Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: heukelum@fastmail.fm
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e0a5a5d9
    • R
      x86_64 ia32 syscall audit fast-path · 5cbf1565
      Roland McGrath 提交于
      This adds fast paths for 32-bit syscall entry and exit when
      TIF_SYSCALL_AUDIT is set, but no other kind of syscall tracing.
      These paths does not need to save and restore all registers as
      the general case of tracing does.  Avoiding the iret return path
      when syscall audit is enabled helps performance a lot.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      5cbf1565
    • R
      x86_64 syscall audit fast-path · 86a1c34a
      Roland McGrath 提交于
      This adds a fast path for 64-bit syscall entry and exit when
      TIF_SYSCALL_AUDIT is set, but no other kind of syscall tracing.
      This path does not need to save and restore all registers as
      the general case of tracing does.  Avoiding the iret return path
      when syscall audit is enabled helps performance a lot.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      86a1c34a
    • R
      x86_64: remove bogus optimization in sysret_signal · 15e8f348
      Roland McGrath 提交于
      This short-circuit path in sysret_signal looks wrong to me.
      AFAICT, in practice the branch is never taken--and if it were,
      it would go wrong.  To wit, try loading a module whose init
      function does set_thread_flag(TIF_IRET), and see insmod crash
      (presumably with a wrong user stack pointer).
      
      This is because the FIXUP_TOP_OF_STACK work hasn't been done yet
      when we jump around the call to ptregscall_common and get to
      int_with_check--where it expects the user RSP,SS,CS and EFLAGS to
      have been stored by FIXUP_TOP_OF_STACK.
      
      I don't think it's normally possible to get to sysret_signal with no
      _TIF_DO_NOTIFY_MASK bits set anyway, so these two instructions are
      already superfluous.  If it ever did happen, it is harmless to call
      do_notify_resume with nothing for it to do.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      15e8f348
  5. 17 7月, 2008 1 次提交
    • R
      x86 ptrace: unify syscall tracing · d4d67150
      Roland McGrath 提交于
      This unifies and cleans up the syscall tracing code on i386 and x86_64.
      
      Using a single function for entry and exit tracing on 32-bit made the
      do_syscall_trace() into some terrible spaghetti.  The logic is clear and
      simple using separate syscall_trace_enter() and syscall_trace_leave()
      functions as on 64-bit.
      
      The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
      on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
      tracing either 32-bit or 64-bit tasks.  It behaves just like 32-bit.
      
      Changing syscall_trace_enter() to return the syscall number shortens
      all the assembly paths, while adding the SYSEMU feature in a simple way.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      d4d67150
  6. 16 7月, 2008 3 次提交
  7. 12 7月, 2008 1 次提交
    • R
      x86_64: fix delayed signals · eca91e78
      Roland McGrath 提交于
      On three of the several paths in entry_64.S that call
      do_notify_resume() on the way back to user mode, we fail to properly
      check again for newly-arrived work that requires another call to
      do_notify_resume() before going to user mode.  These paths set the
      mask to check only _TIF_NEED_RESCHED, but this is wrong.  The other
      paths that lead to do_notify_resume() do this correctly already, and
      entry_32.S does it correctly in all cases.
      
      All paths back to user mode have to check all the _TIF_WORK_MASK
      flags at the last possible stage, with interrupts disabled.
      Otherwise, we miss any flags (TIF_SIGPENDING for example) that were
      set any time after we entered do_notify_resume().  More work flags
      can be set (or left set) synchronously inside do_notify_resume(), as
      TIF_SIGPENDING can be, or asynchronously by interrupts or other CPUs
      (which then send an asynchronous interrupt).
      
      There are many different scenarios that could hit this bug, most of
      them races.  The simplest one to demonstrate does not require any
      race: when one signal has done handler setup at the check before
      returning from a syscall, and there is another signal pending that
      should be handled.  The second signal's handler should interrupt the
      first signal handler before it actually starts (so the interrupted PC
      is still at the handler's entry point).  Instead, it runs away until
      the next kernel entry (next syscall, tick, etc).
      
      This test behaves correctly on 32-bit kernels, and fails on 64-bit
      (either 32-bit or 64-bit test binary).  With this fix, it works.
      
          #define _GNU_SOURCE
          #include <stdio.h>
          #include <signal.h>
          #include <string.h>
          #include <sys/ucontext.h>
      
          #ifndef REG_RIP
          #define REG_RIP REG_EIP
          #endif
      
          static sig_atomic_t hit1, hit2;
      
          static void
          handler (int sig, siginfo_t *info, void *ctx)
          {
            ucontext_t *uc = ctx;
      
            if ((void *) uc->uc_mcontext.gregs[REG_RIP] == &handler)
              {
                if (sig == SIGUSR1)
                  hit1 = 1;
                else
                  hit2 = 1;
              }
      
            printf ("%s at %#lx\n", strsignal (sig),
                    uc->uc_mcontext.gregs[REG_RIP]);
          }
      
          int
          main (void)
          {
            struct sigaction sa;
            sigset_t set;
      
            sigemptyset (&sa.sa_mask);
            sa.sa_flags = SA_SIGINFO;
            sa.sa_sigaction = &handler;
      
            if (sigaction (SIGUSR1, &sa, NULL)
                || sigaction (SIGUSR2, &sa, NULL))
              return 2;
      
            sigemptyset (&set);
            sigaddset (&set, SIGUSR1);
            sigaddset (&set, SIGUSR2);
            if (sigprocmask (SIG_BLOCK, &set, NULL))
              return 3;
      
            printf ("main at %p, handler at %p\n", &main, &handler);
      
            raise (SIGUSR1);
            raise (SIGUSR2);
      
            if (sigprocmask (SIG_UNBLOCK, &set, NULL))
              return 4;
      
            if (hit1 + hit2 == 1)
              {
                puts ("PASS");
                return 0;
              }
      
            puts ("FAIL");
            return 1;
          }
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      eca91e78
  8. 09 7月, 2008 1 次提交
  9. 08 7月, 2008 7 次提交
  10. 27 6月, 2008 1 次提交
    • V
      x86: don't destroy %rbp on kernel-mode faults · 9d8ad5d6
      Vegard Nossum 提交于
      From the code:
      
          "B stepping K8s sometimes report an truncated RIP for IRET exceptions
          returning to compat mode. Check for these here too."
      
      The code then proceeds to truncate the upper 32 bits of %rbp. This means
      that when do_page_fault() is finally called, its prologue,
      
          do_page_fault:
              push %rbp
              movl %rsp, %rbp
      
      will put the truncated base pointer on the stack. This means that the
      stack tracer will not be able to follow the base-pointer changes and
      will see all subsequent stack frames as unreliable.
      
      This patch changes the code to use a different register (%rcx) for the
      checking and leaves %rbp untouched.
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9d8ad5d6
  11. 26 6月, 2008 1 次提交
  12. 24 6月, 2008 1 次提交
  13. 19 6月, 2008 1 次提交
  14. 25 5月, 2008 1 次提交
  15. 24 5月, 2008 2 次提交
    • S
      ftrace: use dynamic patching for updating mcount calls · d61f82d0
      Steven Rostedt 提交于
      This patch replaces the indirect call to the mcount function
      pointer with a direct call that will be patched by the
      dynamic ftrace routines.
      
      On boot up, the mcount function calls the ftace_stub function.
      When the dynamic ftrace code is initialized, the ftrace_stub
      is replaced with a call to the ftrace_record_ip, which records
      the instruction pointers of the locations that call it.
      
      Later, the ftraced daemon will call kstop_machine and patch all
      the locations to nops.
      
      When a ftrace is enabled, the original calls to mcount will now
      be set top call ftrace_caller, which will do a direct call
      to the registered ftrace function. This direct call is also patched
      when the function that should be called is updated.
      
      All patching is performed by a kstop_machine routine to prevent any
      type of race conditions that is associated with modifying code
      on the fly.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      d61f82d0
    • A
      ftrace: add basic support for gcc profiler instrumentation · 16444a8a
      Arnaldo Carvalho de Melo 提交于
      If CONFIG_FTRACE is selected and /proc/sys/kernel/ftrace_enabled is
      set to a non-zero value the ftrace routine will be called everytime
      we enter a kernel function that is not marked with the "notrace"
      attribute.
      
      The ftrace routine will then call a registered function if a function
      happens to be registered.
      
      [ This code has been highly hacked by Steven Rostedt and Ingo Molnar,
        so don't blame Arnaldo for all of this ;-) ]
      
      Update:
        It is now possible to register more than one ftrace function.
        If only one ftrace function is registered, that will be the
        function that ftrace calls directly. If more than one function
        is registered, then ftrace will call a function that will loop
        through the functions to call.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      16444a8a
  16. 17 4月, 2008 1 次提交
    • R
      x86: ptrace vs -ENOSYS · a31f8dd7
      Roland McGrath 提交于
      When we're stopped at syscall entry tracing, ptrace can change the %rax
      value from -ENOSYS to something else.  If no system call is actually made
      because the syscall number (now in orig_rax) is bad, then we now always
      reset %rax to -ENOSYS again.
      
      This changes it to leave the return value alone after entry tracing.
      That way, the %rax value set by ptrace is there to be seen in user mode
      (or in syscall exit tracing).  This is consistent with what the 32-bit
      kernel does.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a31f8dd7
  17. 26 2月, 2008 1 次提交
    • I
      x86: fix execve with -fstack-protect · 5d119b2c
      Ingo Molnar 提交于
      pointed out by pageexec@freemail.hu:
      
      > what happens here is that gcc treats the argument area as owned by the
      > callee, not the caller and is allowed to do certain tricks. for ssp it
      > will make a copy of the struct passed by value into the local variable
      > area and pass *its* address down, and it won't copy it back into the
      > original instance stored in the argument area.
      >
      > so once sys_execve returns, the pt_regs passed by value hasn't at all
      > changed and its default content will cause a nice double fault (FWIW,
      > this part took me the longest to debug, being down with cold didn't
      > help it either ;).
      
      To fix this we pass in pt_regs by pointer.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      5d119b2c
  18. 19 2月, 2008 1 次提交
  19. 10 2月, 2008 1 次提交
  20. 07 2月, 2008 2 次提交
  21. 30 1月, 2008 1 次提交
  22. 26 1月, 2008 1 次提交
    • P
      sched: high-res preemption tick · 8f4d37ec
      Peter Zijlstra 提交于
      Use HR-timers (when available) to deliver an accurate preemption tick.
      
      The regular scheduler tick that runs at 1/HZ can be too coarse when nice
      level are used. The fairness system will still keep the cpu utilisation 'fair'
      by then delaying the task that got an excessive amount of CPU time but try to
      minimize this by delivering preemption points spot-on.
      
      The average frequency of this extra interrupt is sched_latency / nr_latency.
      Which need not be higher than 1/HZ, its just that the distribution within the
      sched_latency period is important.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8f4d37ec
  23. 18 10月, 2007 1 次提交
  24. 12 10月, 2007 1 次提交