1. 18 Jun, 2009 1 commit
    • i386: fix return to 16-bit stack from NMI handler · 2e04bc76
      Alexander van Heukelum committed
      Returning to a task with a 16-bit stack requires special care: the iret
      instruction does not restore the high word of esp in that case. The
      espfix code fixes this, but currently is not invoked on NMIs. This means
      that a running task gets the upper word of esp clobbered due to intervening
      NMIs. To reproduce, compile and run the following program with the nmi
      watchdog enabled (nmi_watchdog=2 on the command line). Using gdb you can
      see that the high bits of esp contain garbage, while the low bits are
      still correct.
      
      This patch puts the espfix code back into the NMI code path.
      
      The patch is slightly complicated due to the irqtrace infrastructure not
      being NMI-safe. The NMI return path cannot call TRACE_IRQS_IRET.
      Otherwise, the tail of the normal iret-code is correct for the nmi code
      path too. To be able to share this code path, TRACE_IRQS_IRET was
      moved up a bit. The espfix code exists after the TRACE_IRQS_IRET, but
      this code explicitly disables interrupts. This short interrupts-off
      section is now not traced anymore. The return-to-kernel path now always
      includes the preliminary test to decide if the espfix code should be
      called. This is never the case, but doing it this way keeps the patch as
      simple as possible and the few extra instructions should not affect
      timing in any significant way.
      
      #define _GNU_SOURCE
      #include <stdio.h>
      #include <sys/types.h>
      #include <sys/mman.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      #include <asm/ldt.h>
      
      int modify_ldt(int func, void *ptr, unsigned long bytecount)
      {
              return syscall(SYS_modify_ldt, func, ptr, bytecount);
      }
      
      /* this is assumed to be usable */
      #define SEGBASEADDR 0x10000
      #define SEGLIMIT 0x20000
      
      /* 16-bit segment */
      struct user_desc desc = {
              .entry_number = 0,
              .base_addr = SEGBASEADDR,
              .limit = SEGLIMIT,
              .seg_32bit = 0,
              .contents = 0, /* ??? */
              .read_exec_only = 0,
              .limit_in_pages = 0,
              .seg_not_present = 0,
              .useable = 1
      };
      
      int main(void)
      {
              setvbuf(stdout, NULL, _IONBF, 0);
      
              /* map a 64 kb segment */
              char *pointer = mmap((void *)SEGBASEADDR, SEGLIMIT+1,
                              PROT_EXEC|PROT_READ|PROT_WRITE,
                              MAP_SHARED|MAP_ANONYMOUS, -1, 0);
              if (pointer == MAP_FAILED) { /* mmap reports failure with MAP_FAILED, not NULL */
                      printf("could not map space\n");
                      return 0;
              }
      
              /* write ldt, new mode */
              int err = modify_ldt(0x11, &desc, sizeof(desc));
              if (err) {
                      printf("error modifying ldt: %i\n", err);
                      return 0;
              }
      
              for (int i=0; i<1000; i++) {
              asm volatile (
                      "pusha\n\t"
                      "mov %ss, %eax\n\t" /* preserve ss:esp */
                      "mov %esp, %ebp\n\t"
                      "push $7\n\t" /* index 0, ldt, user mode */
                      "push $65536-4096\n\t" /* esp */
                      "lss (%esp), %esp\n\t" /* switch to new stack */
                      "push %eax\n\t" /* save old ss:esp on new stack */
                      "push %ebp\n\t"
                      "add $17*65536, %esp\n\t" /* set high bits */
                      "mov %esp, %edx\n\t"
      
                      "mov $10000000, %ecx\n\t" /* wait... */
                      "1: loop 1b\n\t" /* ... a bit */
      
                      "cmp %esp, %edx\n\t"
                      "je 1f\n\t"
                      "ud2\n\t" /* esp changed inexplicably! */
                      "1:\n\t"
                      "sub $17*65536, %esp\n\t" /* restore high bits */
                      "lss (%esp), %esp\n\t" /* restore old ss:esp */
                      "popa\n\t");
      
                      printf("\rx%ix", i);
              }
      
              return 0;
      }
      Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
      Acked-by: Stas Sergeev <stsp@aknet.ru>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
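The effect described in 2e04bc76 can be modelled in a few lines of C. This is only an illustrative sketch: the function name `esp_after_iret_16bit` is hypothetical, and it assumes that on a return to a 16-bit stack segment the low word comes from the saved frame while the high word is whatever the kernel stack pointer held, which is the behaviour the espfix code compensates for.

```c
#include <stdint.h>

/*
 * Model of iret returning to a 16-bit stack segment (assumption: only
 * the low word, SP, is loaded from the saved frame; the high word keeps
 * the bits of the kernel stack pointer in use at that moment).
 * esp_after_iret_16bit is a hypothetical helper, not kernel code.
 */
static uint32_t esp_after_iret_16bit(uint32_t kernel_esp, uint32_t saved_user_esp)
{
        return (kernel_esp & 0xffff0000u) | (saved_user_esp & 0x0000ffffu);
}
```

With the test program's values (low word 0xeff8 after the two pushes, high word set to 17), an NMI return without espfix would leave the low word intact and replace the 17 with kernel-stack bits, which is the "garbage in the high bits" that gdb shows.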
  2. 14 Mar, 2009 1 commit
    • x86: entry_32.S fix compile warnings - fix work mask bit width · 88200bc2
      Jaswinder Singh Rajput committed
      Fix:
      
       arch/x86/kernel/entry_32.S:446: Warning: 00000000080001d1 shortened to 00000000000001d1
       arch/x86/kernel/entry_32.S:457: Warning: 000000000800feff shortened to 000000000000feff
       arch/x86/kernel/entry_32.S:527: Warning: 00000000080001d1 shortened to 00000000000001d1
       arch/x86/kernel/entry_32.S:541: Warning: 000000000800feff shortened to 000000000000feff
       arch/x86/kernel/entry_32.S:676: Warning: 0000000008000091 shortened to 0000000000000091
      
      TIF_SYSCALL_FTRACE is 0x08000000 and until now we checked the
      first 16 bits of the work mask - bit 27 falls outside of that.
      
      Update the entry_32.S code to check the full 32-bit mask.
      
      [ %cx => %ecx fix from Cyrill Gorcunov <gorcunov@gmail.com> ]
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "H. Peter Anvin" <hpa@kernel.org>
      LKML-Reference: <1237012693.18733.3.camel@ht.satnam>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
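The truncation reported by the assembler in 88200bc2 can be reproduced with plain integer arithmetic. The sketch below is an illustration only; the helper names `work_pending_16` and `work_pending_32` are hypothetical, while the mask value 0x080001d1 and the flag value 0x08000000 are taken from the commit message.

```c
#include <stdint.h>

/* Test the work mask using only the low 16 bits, as a 16-bit
 * instruction operand would (hypothetical helper). */
static int work_pending_16(uint32_t flags, uint32_t mask)
{
        return ((uint16_t)flags & (uint16_t)mask) != 0;
}

/* Test the full 32-bit mask (hypothetical helper). */
static int work_pending_32(uint32_t flags, uint32_t mask)
{
        return (flags & mask) != 0;
}
```

A task with only bit 27 set (0x08000000) is missed by the 16-bit test and caught by the 32-bit one, which matches the "shortened to 00000000000001d1" warning.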
  3. 24 Feb, 2009 1 commit
    • x86: minor cleanup in the espfix code · bda3a897
      Stas Sergeev committed
      Impact: Cleanup
      
      Checkin be44d2aa eliminates the use of
      a 16-bit stack for espfix.  However, at least one instruction remained
      that only operated on the low 16 bits of %esp.
      
      This is not a bug per se because the kernel stack is always an aligned
      4K or 8K block.  Therefore it cannot cross 64K boundaries; this code,
      in fact, relies strictly on that fact.
      
      However, it's a lot cleaner (and, for that matter, smaller) to operate
      on the entire 32-bit register.
      Signed-off-by: Stas Sergeev <stsp@aknet.ru>
      CC: Zachary Amsden <zach@vmware.com>
      CC: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
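The claim in bda3a897 that an aligned 4K or 8K kernel stack cannot cross a 64K boundary can be checked with a short sketch. The helper name `block_crosses_64k` is hypothetical, and the addresses used with it are arbitrary examples rather than real kernel stack locations.

```c
#include <stdint.h>

/*
 * Return 1 if the block [base, base + size) spans two different 64K
 * regions, i.e. if the high 16 bits of its first and last byte differ.
 * Hypothetical helper for illustration; size must be non-zero.
 */
static int block_crosses_64k(uint32_t base, uint32_t size)
{
        uint32_t last = base + size - 1;

        return (base >> 16) != (last >> 16);
}
```

Because 64K is a multiple of both 4K and 8K, a block aligned to its own size always lies within a single 64K region, so 16-bit arithmetic on the low word of %esp cannot carry into the high word. An unaligned block has no such guarantee.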
  4. 14 Feb, 2009 1 commit
  5. 11 Feb, 2009 1 commit
  6. 10 Feb, 2009 3 commits
    • x86: implement x86_32 stack protector · 60a5317f
      Tejun Heo committed
      Impact: stack protector for x86_32
      
      Implement stack protector for x86_32.  GDT entry 28 is used for it.
      It's set to point to stack_canary-20 and has a length of 24 bytes.
      CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs
      to the stack canary segment on entry.  As %gs is otherwise unused by
      the kernel, the canary can be anywhere.  It's defined as a percpu
      variable.
      
      x86_32 exception handlers take register frame on stack directly as
      struct pt_regs.  With -fstack-protector turned on, gcc copies the
      whole structure after the stack canary and (of course) doesn't copy
      back on return, thus losing all changes.  For now, -fno-stack-protector
      is added to all files which contain those functions.  We definitely
      need something better.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
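The numbers in 60a5317f (base at stack_canary-20, length 24) fit together if the compiler reads a 4-byte canary at a fixed offset of 20 from %gs. That offset is an assumption here, based on how GCC's 32-bit x86 stack protector is generally understood to behave; the message itself does not state it. The constant and function names below are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: a 4-byte canary read at %gs:20 (not stated in the
 * commit message; names are hypothetical). */
enum {
        CANARY_GS_OFFSET = 20,
        CANARY_SIZE      = 4
};

/* Segment base chosen so that base + 20 lands exactly on the canary. */
static uint32_t canary_segment_base(uint32_t canary_addr)
{
        return canary_addr - CANARY_GS_OFFSET;
}

/* Smallest segment length that still covers the canary bytes. */
static uint32_t canary_segment_length(void)
{
        return CANARY_GS_OFFSET + CANARY_SIZE;
}
```

Under that assumption the segment only needs to reach the end of the canary, which gives the 24 bytes mentioned above.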
    • x86: make lazy %gs optional on x86_32 · ccbeed3a
      Tejun Heo committed
      Impact: pt_regs changed, lazy gs handling made optional, add slight
              overhead to SAVE_ALL, simplifies error_code path a bit
      
      On x86_32, %gs hasn't been used by the kernel and is handled lazily.  pt_regs
      doesn't have a place for it, and gs is saved/loaded only when necessary.
      In preparation for stack protector support, this patch makes lazy %gs
      handling optional by doing the following.
      
      * Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs.
      
      * Save and restore %gs along with other registers in entry_32.S unless
        LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
        even when LAZY_GS.  However, it adds no overhead to common exit path
        and simplifies entry path with error code.
      
      * Define different user_gs accessors depending on LAZY_GS and add
        lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
        lazy_*_gs() ops are used to save, load and clear %gs lazily.
      
      * Define ELF_CORE_COPY_KERNEL_REGS() which always reads %gs directly.
      
      xen and lguest changes need to be verified.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: use asm .macro instead of cpp #define in entry_32.S · f0d96110
      Tejun Heo committed
      Impact: cleanup
      
      Use .macro instead of cpp #define where appropriate.  This cleans up
      code and will ease future changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 29 Jan, 2009 1 commit
  8. 21 Jan, 2009 1 commit
    • x86: make x86_32 use tlb_64.c · 02cf94c3
      Tejun Heo committed
      Impact: less contention when issuing invalidate IPI, cleanup
      
      Make x86_32 use the same tlb code as 64bit.  The 64bit code uses
      multiple IPI vectors for tlb shootdown to reduce contention.  This
      patch makes x86_32 allocate the same 8 IPIs as x86_64 and share the
      code paths.
      
      Note that the usage of asmlinkage is inconsistent for x86_32 and 64
      and calls for further cleanup.  This has been noted with a FIXME
      comment in tlb_64.c.
      Signed-off-by: Tejun Heo <tj@kernel.org>
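The contention argument in 02cf94c3 can be pictured as follows. The commit message only says that 8 IPI vectors are used for TLB shootdown; the selection rule shown here (sender CPU number modulo the vector count) is one plausible scheme and is an assumption, as are the names `NUM_TLB_VECTORS` and `tlb_vector_slot`.

```c
/* Number of shootdown IPI vectors, from the commit message. */
enum { NUM_TLB_VECTORS = 8 };

/*
 * Pick which of the shootdown vectors a sending CPU uses.
 * Assumption: senders are spread by CPU number modulo the vector
 * count, so CPUs in different slots do not wait on each other.
 */
static unsigned int tlb_vector_slot(unsigned int sender_cpu)
{
        return sender_cpu % NUM_TLB_VECTORS;
}
```

With a single vector every sender would serialize on the same slot; with eight, only CPUs that map to the same slot contend.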
  9. 13 Jan, 2009 1 commit
  10. 03 Dec, 2008 2 commits
  11. 27 Nov, 2008 1 commit
  12. 26 Nov, 2008 3 commits
  13. 24 Nov, 2008 1 commit
  14. 16 Nov, 2008 1 commit
    • tracing/function-return-tracer: support for dynamic ftrace on function return tracer · e7d3737e
      Frederic Weisbecker committed
      This patch adds support for dynamic tracing to the function return tracer.
      The only difference from normal dynamic function tracing is that we don't need
      to hook on a particular callback. All we need is to dynamically nop out or
      set the calls to ftrace_caller (which is ftrace_return_caller here).
      
      Some security checks ensure that we are not trying to launch dynamic tracing for
      return tracing while normal function tracing is already running.
      
      An example of trace with getnstimeofday set as a filter:
      
      ktime_get_ts+0x22/0x50 -> getnstimeofday (2283 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1396 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1825 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1426 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1524 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1434 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1502 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1404 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1397 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1051 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1314 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1344 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1163 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1390 ns)
      ktime_get_ts+0x22/0x50 -> getnstimeofday (1374 ns)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 13 Nov, 2008 2 commits
  16. 12 Nov, 2008 2 commits
    • x86: 32 bits: shrink and align IRQ stubs · b7c6244f
      H. Peter Anvin committed
      Shrink the IRQ stubs on 32 bits down to just over four bytes per stub (we
      fit seven into a 32-byte chunk).  This shrinks the total icache
      consumption of the IRQ stubs down to an even kilobyte, if all of them
      are in active use.
      
      The downside is that we end up with a double jump, which could have a
      negative effect on some pipelines.  The double jump is always inside
      the same cacheline on any modern chips (the exception being
      486/Elan/Geode which have only 16-byte cachelines, but are unlikely to
      have too many interrupt sources.)
      
      To get the most effect, cache-align the IRQ stubs.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
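The "even kilobyte" figure in b7c6244f can be checked with simple arithmetic. The stub packing (seven stubs per 32-byte chunk) comes from the commit message; the vector counts are assumptions taken from the usual x86 values (256 vectors in total, external vectors starting at 0x20), and the function name `irq_stub_bytes` is hypothetical.

```c
enum {
        NR_VECTORS            = 256,  /* assumed total number of vectors */
        FIRST_EXTERNAL_VECTOR = 0x20, /* assumed first external vector */
        STUBS_PER_CHUNK       = 7,    /* from the commit message */
        CHUNK_BYTES           = 32    /* from the commit message */
};

/* Total bytes occupied by cache-aligned IRQ stub chunks. */
static unsigned int irq_stub_bytes(void)
{
        unsigned int stubs  = NR_VECTORS - FIRST_EXTERNAL_VECTOR;
        /* Round up so a partly filled chunk still counts as a whole one. */
        unsigned int chunks = (stubs + STUBS_PER_CHUNK - 1) / STUBS_PER_CHUNK;

        return chunks * CHUNK_BYTES;
}
```

Under these assumptions there are 224 stubs, which fill exactly 32 chunks of 32 bytes, or 1024 bytes in total.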
    • x86: 32 bit: interrupt stub consistency with 64 bit · 4687518c
      H. Peter Anvin committed
      Don't generate interrupt stubs for interrupt vectors below
      FIRST_EXTERNAL_VECTOR, and make the table of interrupt vectors
      (interrupt[]) __initconst.  These changes both conserve memory
      and improve consistency with 64 bits.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  17. 11 Nov, 2008 1 commit
    • tracing, x86: add low level support for ftrace return tracing · caf4b323
      Frederic Weisbecker committed
      Impact: add infrastructure for function-return tracing
      
      Add low level support for ftrace return tracing.
      
      This plug-in stores return addresses on the thread_info structure of
      the current task.
      
      The index of the current return address is initialized when the task
      is the first one (init) and when a process forks (the child). It is
      not needed when a task does a sys_execve because after this syscall,
      it still needs to return on the kernel functions it called.
      
      Note that the code of return_to_handler was suggested by Steven
      Rostedt, as were almost all of the improvements in this V3.
      
      For safety, arch/x86/kernel/process_32.c is not traced
      because __switch_to() changes the current task during its execution.
      That could cause inconsistency in the stored return address of this
      function, although I did not see any crash when testing with tracing
      enabled on this function.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
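The per-task bookkeeping described in caf4b323 amounts to a small stack of return addresses with an index that is reset when a task is created. The sketch below illustrates that idea only. It is not the kernel's data structure: the struct, the depth constant, and the function names are all hypothetical.

```c
/* Hypothetical depth; the commit message does not give one. */
#define RET_STACK_DEPTH 20

/* Per-task stack of saved return addresses (hypothetical layout). */
struct ret_stack {
        int curr;                            /* index of the top entry, -1 when empty */
        unsigned long ret[RET_STACK_DEPTH];
};

/* Reset the index, as would be done for init and for a forked child. */
static void ret_stack_init(struct ret_stack *s)
{
        s->curr = -1;
}

/* Save a return address on function entry; returns -1 if the stack is full. */
static int ret_stack_push(struct ret_stack *s, unsigned long addr)
{
        if (s->curr + 1 >= RET_STACK_DEPTH)
                return -1;
        s->ret[++s->curr] = addr;
        return 0;
}

/* Fetch the saved address on function return; returns -1 if the stack is empty. */
static int ret_stack_pop(struct ret_stack *s, unsigned long *addr)
{
        if (s->curr < 0)
                return -1;
        *addr = s->ret[s->curr--];
        return 0;
}
```

The index is not reset on execve in this model either: the task is still inside kernel functions whose saved return addresses must be popped on the way out, which is the reason the message gives.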
  18. 06 Nov, 2008 1 commit
    • ftrace: add quick function trace stop · 60a7ecf4
      Steven Rostedt committed
      Impact: quick start and stop of function tracer
      
      This patch adds a way to disable the function tracer quickly without
      the need to run kstop_machine. It adds a new variable called
      function_trace_stop which will stop the calls to functions from mcount
      when set.  This is just an on/off switch and does not handle recursion
      like preempt_disable().
      
      Its main purpose is to help other tracers/debuggers start and stop tracing
      functions without the need to call kstop_machine.
      
      The config option HAVE_FUNCTION_TRACE_MCOUNT_TEST is added for archs
      that implement the testing of the function_trace_stop in the mcount
      arch dependent code. Otherwise, the test is done in the C code.
      
      x86 is the only arch at the moment that supports this.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
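The on/off switch described in 60a7ecf4 can be sketched in C as a flag that is tested before the tracer callback is invoked. Only the variable name `function_trace_stop` comes from the commit message; `mcount_entry`, `trace_function`, and `calls_traced` are hypothetical stand-ins for the real entry stub and tracer.

```c
/* Flag named in the commit message: non-zero stops tracing calls. */
static volatile int function_trace_stop;

/* Hypothetical counter standing in for real tracer work. */
static unsigned long calls_traced;

/* Hypothetical tracer callback. */
static void trace_function(unsigned long ip)
{
        (void)ip;
        calls_traced++;
}

/*
 * Hypothetical model of the mcount entry point: a plain on/off test,
 * with no nesting count, so it does not handle recursion the way
 * preempt_disable() does.
 */
static void mcount_entry(unsigned long ip)
{
        if (function_trace_stop)
                return;
        trace_function(ip);
}
```

Setting the flag takes effect on the next call without patching any code, which is why no kstop_machine run is needed to pause tracing.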
  19. 22 Oct, 2008 1 commit
  20. 21 Oct, 2008 1 commit
  21. 16 Oct, 2008 1 commit
    • x86: make 32bit support per_cpu vector · 497c9a19
      Yinghai Lu committed
      so we can merge io_apic_32.c and io_apic_64.c
      
      v2: Use cpu_online_map as target cpus for bigsmp, just like 64-bit is doing.
      
      Also remove some unused TARGET_CPUS macro.
      
      v3: need to check if desc is null in smp_irq_move_cleanup
      
      also migration needs to reset vector too, so copy __target_IO_APIC_irq
      from 64bit.
      
      (the duplication will go away once the two files are unified.)
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  22. 14 Oct, 2008 1 commit
    • ftrace: x86 mcount stub · 0a37605c
      Steven Rostedt committed
      x86 now sets up the mcount locations through the build and no longer
      needs to record the ip when the function is executed. This patch changes
      the initial mcount to simply return. There's no need to do any other work.
      If the ftrace start up test fails, the original mcount will be what everything
      will use, so having this as fast as possible is a good thing.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  23. 13 Oct, 2008 6 commits
  24. 24 Jul, 2008 1 commit
    • i386 syscall audit fast-path · af0575bb
      Roland McGrath committed
      This adds fast paths for 32-bit syscall entry and exit when
      TIF_SYSCALL_AUDIT is set, but no other kind of syscall tracing.
      These paths do not need to save and restore all registers as
      the general case of tracing does.  Avoiding the iret return path
      when syscall audit is enabled helps performance a lot.
      Signed-off-by: Roland McGrath <roland@redhat.com>
  25. 19 Jul, 2008 1 commit
  26. 17 Jul, 2008 2 commits
    • x86 ptrace: unify syscall tracing · d4d67150
      Roland McGrath committed
      This unifies and cleans up the syscall tracing code on i386 and x86_64.
      
      Using a single function for entry and exit tracing on 32-bit made the
      do_syscall_trace() into some terrible spaghetti.  The logic is clear and
      simple using separate syscall_trace_enter() and syscall_trace_leave()
      functions as on 64-bit.
      
      The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
      on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
      tracing either 32-bit or 64-bit tasks.  It behaves just like 32-bit.
      
      Changing syscall_trace_enter() to return the syscall number shortens
      all the assembly paths, while adding the SYSEMU feature in a simple way.
      Signed-off-by: Roland McGrath <roland@redhat.com>
    • x86 ptrace: unify TIF_SINGLESTEP · 64f09733
      Roland McGrath committed
      This unifies the treatment of TIF_SINGLESTEP on i386 and x86_64.
      The bit is now excluded from _TIF_WORK_MASK on i386 as it has been
      on x86_64.  This means the do_notify_resume() path using it is never
      used, so TIF_SINGLESTEP is not cleared on returning to user mode.
      
      Both now leave TIF_SINGLESTEP set when returning to user, so that
      it's already set on an int $0x80 system call entry.  This removes
      the need for testing TF on the system_call path.  Doing it this way
      fixes the regression for PTRACE_SINGLESTEP into a sigreturn syscall,
      introduced by commit 1e2e99f0.
      
      The clear_TF_reenable case that sets TIF_SINGLESTEP can only happen
      on a non-exception kernel entry, i.e. sysenter/syscall instruction.
      That will always get to the syscall exit tracing path.
      Signed-off-by: Roland McGrath <roland@redhat.com>
  27. 12 Jul, 2008 1 commit