1. 21 Nov, 2012 2 commits
    • x86-32: Export kernel_stack_pointer() for modules · cb57a2b4
      H. Peter Anvin committed
      Modules, in particular oprofile (and possibly other similar tools)
      need kernel_stack_pointer(), so export it using EXPORT_SYMBOL_GPL().
      
      Cc: Yang Wei <wei.yang@windriver.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Jun Zhang <jun.zhang@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • x86-32: Fix invalid stack address while in softirq · 10226238
      Robert Richter committed
      On 32-bit, the stack address provided by kernel_stack_pointer() may
      point to an invalid range, causing NULL pointer accesses or page faults
      while in NMI (see trace below). This happens if it is called in softirq
      context and the stack is empty. The address at &regs->sp is then
      out of range.
      
      Fix this by checking whether regs and &regs->sp are in the same stack
      context. Otherwise return the previous stack pointer stored in struct
      thread_info. If that address is invalid too, return the address of regs.
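
      The fallback order described above can be sketched in user-space C.
      This is a minimal illustration, not the kernel function: the names
      stack_base() and pick_stack_pointer(), the 8 KiB stack size, and the
      use of 0 to mean "no valid previous stack pointer" are all assumptions
      made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stack size for this sketch; it must be a power of two. */
#define SKETCH_THREAD_SIZE 8192u

/* Base address of the stack-sized, stack-aligned region containing addr. */
static uintptr_t stack_base(uintptr_t addr)
{
    assert((SKETCH_THREAD_SIZE & (SKETCH_THREAD_SIZE - 1)) == 0);
    return addr & ~(uintptr_t)(SKETCH_THREAD_SIZE - 1);
}

/*
 * regs_addr:   address of the saved register frame (regs)
 * sp_addr:     address of the sp slot in that frame (&regs->sp)
 * previous_sp: stack pointer saved in thread_info, 0 if none (assumption)
 */
static uintptr_t pick_stack_pointer(uintptr_t regs_addr, uintptr_t sp_addr,
                                    uintptr_t previous_sp)
{
    /* Same stack context: &regs->sp is a usable stack address. */
    if (stack_base(regs_addr) == stack_base(sp_addr))
        return sp_addr;

    /* Different context: prefer the previously saved stack pointer. */
    if (previous_sp != 0)
        return previous_sp;

    /* Last resort: the address of regs itself is always valid. */
    return regs_addr;
}
```

      The mask comparison only asks whether two addresses fall in the same
      aligned stack region, which is why an empty stack (where &regs->sp
      sits just past the end of the region) fails the first test.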
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000a
       IP: [<c1004237>] print_context_stack+0x6e/0x8d
       *pde = 00000000
       Oops: 0000 [#1] SMP
       Modules linked in:
       Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 Hewlett-Packard HP xw9400 Workstation/0A1Ch
       EIP: 0060:[<c1004237>] EFLAGS: 00010093 CPU: 0
       EIP is at print_context_stack+0x6e/0x8d
       EAX: ffffe000 EBX: 0000000a ECX: f4435f94 EDX: 0000000a
       ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0
        DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
       CR0: 8005003b CR2: 0000000a CR3: 34ac9000 CR4: 000007d0
       DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
       DR6: ffff0ff0 DR7: 00000400
       Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000)
       Stack:
        000003e8 ffffe000 00001ffc f4e39b00 00000000 0000000a f4435f94 c155198c
        f5409ef0 c1003723 c155198c f5409f04 00000000 f5409edc 00000000 00000000
        f5409ee8 f4435f94 f5409fc4 00000001 f5409f1c c12dce1c 00000000 c155198c
       Call Trace:
        [<c1003723>] dump_trace+0x7b/0xa1
        [<c12dce1c>] x86_backtrace+0x40/0x88
        [<c12db712>] ? oprofile_add_sample+0x56/0x84
        [<c12db731>] oprofile_add_sample+0x75/0x84
        [<c12ddb5b>] op_amd_check_ctrs+0x46/0x260
        [<c12dd40d>] profile_exceptions_notify+0x23/0x4c
        [<c1395034>] nmi_handle+0x31/0x4a
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c13950ed>] do_nmi+0xa0/0x2ff
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c13949e5>] nmi_stack_correct+0x28/0x2d
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c1003603>] ? do_softirq+0x4b/0x7f
        <IRQ>
        [<c102a06f>] irq_exit+0x35/0x5b
        [<c1018f56>] smp_apic_timer_interrupt+0x6c/0x7a
        [<c1394746>] apic_timer_interrupt+0x2a/0x30
       Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 <8b> 13 89 d0 89 55 e0 e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc
       EIP: [<c1004237>] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0
       CR2: 000000000000000a
       ---[ end trace 62afee3481b00012 ]---
       Kernel panic - not syncing: Fatal exception in interrupt
      
      V2:
      * add comments to kernel_stack_pointer()
      * always return a valid stack address by falling back to the address
        of regs
      Reported-by: Yang Wei <wei.yang@windriver.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Jun Zhang <jun.zhang@intel.com>
  2. 26 Sep, 2012 1 commit
    • x86: Syscall hooks for userspace RCU extended QS · bf5a3c13
      Frederic Weisbecker committed
      Add syscall slow path hooks to notify syscall entry
      and exit on CPUs that want to support userspace RCU
      extended quiescent state.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
  3. 19 Sep, 2012 1 commit
    • x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels · 72a671ce
      Suresh Siddha committed
      Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied
      to/from the fpstate in the task struct.
      
      And in the case of signal delivery for x86_64 binaries, if the fpstate is live
      in the CPU registers, then the live state is copied directly to the user
      sigframe. Otherwise the fpstate in the task struct is copied to the user sigframe.
      During restore, fpstate in the user sigframe is restored directly to the live
      CPU registers.
      
      Historically, different code paths led to different bugs. For example,
      the x86_64 code path was not preemption safe until recently. There is also
      a lot of code duplication to support new features like xsave.
      
      Unify signal handling code paths for x86 and x86_64 kernels.
      
      New strategy is as follows:
      
      Signal delivery: Both for 32/64-bit frames, align the core math frame area to
      64 bytes as needed by xsave (this is where the main fpu/extended state gets copied
      to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave
      frames). If the state is live, copy the register state directly to the user
      frame. If not live, copy the state in the thread struct to the user frame. And
      for 32-bit [f]xsave frames, construct the fsave header separately before
      the actual [f]xsave area.
      
      Signal return: As the 32-bit frames with [f]xstate have an additional
      'fsave' header, copy everything back from the user sigframe to the
      fpstate in the task structure and reconstruct the fxstate from the 'fsave'
      header (Also user passed pointers may not be correctly aligned for
      any attempt to directly restore any partial state). At the next fpstate usage,
      everything will be restored to the live CPU registers.
      For all the 64-bit frames and the 32-bit fsave frame, restore the state from
      the user sigframe directly to the live CPU registers. 64-bit signals always
      restored the math frame directly, so we can expect the math frame pointer
      to be correctly aligned. For 32-bit fsave frames, there are no alignment
      requirements, so we can restore the state directly.
      
      "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are
      within the noise range with this change.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com
      [ Merged in compilation fix ]
      Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  4. 02 Jun, 2012 1 commit
  5. 14 Apr, 2012 1 commit
    • x86: Enable HAVE_ARCH_SECCOMP_FILTER · c6cfbeb4
      Will Drewry committed
      Enable support for seccomp filter on x86:
      - syscall_get_arch()
      - syscall_get_arguments()
      - syscall_rollback()
      - syscall_set_return_value()
      - SIGSYS siginfo_t support
      - secure_computing is called from a ptrace_event()-safe context
      - secure_computing return value is checked (see below).
      
      SECCOMP_RET_TRACE and SECCOMP_RET_TRAP may result in seccomp needing to
      skip a system call without killing the process.  This is done by
      returning a non-zero (-1) value from secure_computing.  This change
      makes x86 respect that return value.
      
      To ensure that minimal kernel code is exposed, a non-zero return value
      results in an immediate return to user space (with an invalid syscall
      number).
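
      The return-value check described above can be pictured with a small
      user-space sketch. Nothing here is the kernel's code: the callback type
      filter_fn, the function syscall_entry_sketch(), and the two sample
      filters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the filter verdict: 0 = run, non-zero = skip. */
typedef int (*filter_fn)(long syscall_nr);

/*
 * Returns the syscall number to dispatch, or -1 (an invalid syscall
 * number) when the filter asks for the call to be skipped.
 */
static long syscall_entry_sketch(long syscall_nr, filter_fn filter)
{
    assert(filter != NULL);

    if (filter(syscall_nr) != 0)
        return -1L; /* skip the call, go straight back to user space */

    return syscall_nr;
}

/* Sample filters for the sketch. */
static int allow_all(long nr) { (void)nr; return 0; }
static int skip_all(long nr)  { (void)nr; return -1; }
```

      Returning an invalid syscall number keeps the skip path short: no
      further entry work runs on behalf of a call the filter rejected.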
      Signed-off-by: Will Drewry <wad@chromium.org>
      Reviewed-by: H. Peter Anvin <hpa@zytor.com>
      Acked-by: Eric Paris <eparis@redhat.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      
      v18: rebase and tweaked change description, acked-by
      v17: added reviewed by and rebased
      v..: all rebases since original introduction.
      Signed-off-by: James Morris <james.l.morris@oracle.com>
  6. 29 Mar, 2012 1 commit
  7. 13 Mar, 2012 1 commit
  8. 06 Mar, 2012 1 commit
    • x32: Add ptrace for x32 · 55283e25
      H.J. Lu committed
      X32 ptrace is a hybrid of 64bit ptrace and compat ptrace with 32bit
      address and longs.  It uses 64bit ptrace to access the full 64bit
      registers.  PTRACE_PEEKUSR and PTRACE_POKEUSR are only allowed to access
      segment and debug registers.  PTRACE_PEEKUSR returns the lower 32bits
      and PTRACE_POKEUSR zero-extends a 32bit value to 64bit.  It works since
      the upper 32bits of segment and debug registers of x32 process are always
      zero.  GDB only uses PTRACE_PEEKUSR and PTRACE_POKEUSR to access
      segment and debug registers.
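
      As a rough illustration of why truncating on read and zero-extending on
      write is lossless here, consider the following sketch. The helper names
      are hypothetical and the register values are made up; only the
      truncate/zero-extend behaviour comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* PEEKUSR-style read: hand back only the lower 32 bits of the register. */
static uint32_t x32_peek_sketch(uint64_t reg)
{
    return (uint32_t)reg;
}

/* POKEUSR-style write: zero-extend the 32-bit value to 64 bits. */
static uint64_t x32_poke_sketch(uint32_t value)
{
    return (uint64_t)value;
}

/* Small self-check showing when the round trip preserves the value. */
static void x32_round_trip_demo(void)
{
    uint64_t segment_like = 0x2bULL;         /* upper 32 bits are zero */
    uint64_t wide_value   = 0x100000033ULL;  /* upper 32 bits are set  */

    assert(x32_poke_sketch(x32_peek_sketch(segment_like)) == segment_like);
    assert(x32_poke_sketch(x32_peek_sketch(wide_value)) != wide_value);
}
```

      The second assertion shows the case the design rules out: a register
      with non-zero upper bits would not survive the round trip, which is why
      access is limited to segment and debug registers.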
      
      [ hpa: changed TIF_X32 test to use !is_ia32_task() instead, and moved
        the system call number to the now-unused 521 slot. ]
      Signed-off-by: "H.J. Lu" <hjl.tools@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/1329696488-16970-1-git-send-email-hpa@zytor.com
  9. 22 Feb, 2012 1 commit
    • i387: Split up <asm/i387.h> into exported and internal interfaces · 1361b83a
      Linus Torvalds committed
      While various modules include <asm/i387.h> to get access to things we
      actually *intend* for them to use, most of that header file was really
      pretty low-level internal stuff that we really don't want to expose to
      others.
      
      So split the header file into two: the small exported interfaces remain
      in <asm/i387.h>, while the internal definitions that are only used by
      core architecture code are now in <asm/fpu-internal.h>.
      
      The guiding principle for this was to expose functions that we export to
      modules, and leave them in <asm/i387.h>, while stuff that is used by
      task switching or was marked GPL-only is in <asm/fpu-internal.h>.
      
      The fpu-internal.h file could be further split up too, especially since
      arch/x86/kvm/ uses some of the remaining stuff for its module.  But that
      kvm usage should probably be abstracted out a bit, and at least now the
      internal FPU accessor functions are much more contained.  Even if it
      isn't perhaps as contained as it _could_ be.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211340330.5354@i5.linux-foundation.org
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  10. 18 Jan, 2012 2 commits
    • audit: inline audit_syscall_entry to reduce burden on archs · b05d8447
      Eric Paris committed
      Every arch calls:
      
      if (unlikely(current->audit_context))
      	audit_syscall_entry()
      
      which requires knowledge about audit (the existence of audit_context) in
      the arch code.  Just do it all in a static inline in audit.h so that arches
      can remain blissfully ignorant.
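
      A minimal user-space sketch of this pattern follows. The struct and
      function names carry a _sketch suffix because they are illustrative
      stand-ins, not the kernel definitions; the point is only that the
      NULL check moves out of the caller and into a header-style inline.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the audit context and the current task. */
struct audit_context_sketch { int entries; };
struct task_sketch { struct audit_context_sketch *audit_context; };

/* The out-of-line worker; it is only reached when auditing is active. */
static void audit_entry_worker_sketch(struct audit_context_sketch *ctx)
{
    ctx->entries++;
}

/*
 * Header-style wrapper: callers invoke this unconditionally and never
 * look at audit_context themselves.
 */
static inline void audit_syscall_entry_sketch(struct task_sketch *task)
{
    assert(task != NULL);

    if (task->audit_context != NULL)
        audit_entry_worker_sketch(task->audit_context);
}
```

      With the check inside the inline, the fast path for unaudited tasks is
      still a single pointer test, but arch code no longer needs to know the
      field exists.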
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • Audit: push audit success and retcode into arch ptrace.h · d7e7528b
      Eric Paris committed
      The audit system previously expected arches calling to audit_syscall_exit to
      supply as arguments if the syscall was a success and what the return code was.
      Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things
      by converting from negative retcodes to an audit internal magic value stating
      success or failure.  This helper was wrong and could indicate that a valid
      pointer returned to userspace was a failed syscall.  The fix is to fix the
      layering foolishness.  We now pass audit_syscall_exit a struct pt_regs and it
      in turn calls back into arch code to collect the return value and to
      determine if the syscall was a success or failure.  We also define a generic
      is_syscall_success() macro which determines success/failure based on if the
      value is < -MAX_ERRNO.  This works for arches like x86 which do not use a
      separate mechanism to indicate syscall failure.
      
      We make both the is_syscall_success() and regs_return_value() static inlines
      instead of macros.  The reason is because the audit function must take a void*
      for the regs.  (uml calls theirs struct uml_pt_regs instead of just struct
      pt_regs so audit_syscall_exit can't take a struct pt_regs).  Since the audit
      function takes a void* we need to use static inlines to cast it back to the
      arch correct structure to dereference it.
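
      The two helpers described above can be sketched as follows. This is a
      user-space approximation under stated assumptions: the one-field
      pt_regs_sketch struct is invented for the example, MAX_ERRNO_SKETCH is
      set to 4095, and the comparison is done on the unsigned representation
      of the return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Largest errno value encoded in a negative return (assumed 4095). */
#define MAX_ERRNO_SKETCH 4095UL

/* Invented register layout: only the return-value register is modelled. */
struct pt_regs_sketch { unsigned long ax; };

/* Takes void * like the audit hook, then casts back to the arch type. */
static inline long regs_return_value_sketch(const void *regs)
{
    const struct pt_regs_sketch *r = regs;

    assert(r != NULL);
    return (long)r->ax;
}

/* Success means the value is below the top MAX_ERRNO values (unsigned). */
static inline bool is_syscall_success_sketch(const void *regs)
{
    unsigned long v = (unsigned long)regs_return_value_sketch(regs);

    return v < (unsigned long)-MAX_ERRNO_SKETCH;
}
```

      Comparing against the top of the unsigned range is what lets a large
      but valid pointer count as success while small negative values such as
      -ENOENT count as failure.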
      
      The other major change is that on some arches, like ia64, MIPS and ppc, we
      change regs_return_value() to give us the negative value on syscall failure.
      The only other user of this macro, kretprobe_example.c, won't notice and it
      makes the value signed consistently for the audit functions across all archs.
      
      In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
      audit code as the return value.  But the ptrace_64.h code defined the macro
      regs_return_value() as regs[3].  I have no idea which one is correct, but this
      patch now uses the regs_return_value() function, so it now uses regs[3].
      
      For powerpc we previously used regs->result but now use the
      regs_return_value() function which uses regs->gprs[3].  regs->gprs[3] is
      always positive so the regs_return_value(), much like ia64 makes it negative
      before calling the audit code when appropriate.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
      Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
      Acked-by: Richard Weinberger <richard@nod.at> [for uml]
      Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
  11. 05 Dec, 2011 1 commit
  12. 01 Jul, 2011 2 commits
    • perf: Add context field to perf_event · 4dc0da86
      Avi Kivity committed
      The perf_event overflow handler does not receive any caller-derived
      argument, so many callers need to resort to looking up the perf_event
      in their local data structure.  This is ugly and doesn't scale if a
      single callback services many perf_events.
      
      Fix by adding a context parameter to perf_event_create_kernel_counter()
      (and derived hardware breakpoints APIs) and storing it in the perf_event.
      The field can be accessed from the callback as event->overflow_handler_context.
      All callers are updated.
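
      The idea can be sketched outside the kernel like this. Only the field
      name overflow_handler_context is taken from the text above; the struct
      layout, create_counter_sketch(), and the owner type are assumptions
      made so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

struct perf_event_sketch;
typedef void (*overflow_handler_sketch)(struct perf_event_sketch *event);

/* Reduced event: just the callback and the caller-supplied context. */
struct perf_event_sketch {
    overflow_handler_sketch overflow_handler;
    void *overflow_handler_context;
};

/* Hypothetical creation helper that stores the context in the event. */
static void create_counter_sketch(struct perf_event_sketch *event,
                                  overflow_handler_sketch handler,
                                  void *context)
{
    assert(event != NULL && handler != NULL);
    event->overflow_handler = handler;
    event->overflow_handler_context = context;
}

/* Caller-side state that the callback needs to reach. */
struct owner_sketch { int overflows; };

/* One callback serving many events: no lookup table is required. */
static void count_overflow_sketch(struct perf_event_sketch *event)
{
    struct owner_sketch *owner = event->overflow_handler_context;

    owner->overflows++;
}
```

      Because each event carries its own context pointer, the same callback
      can be registered for any number of events without the caller keeping a
      reverse mapping from event to owner.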
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Remove the nmi parameter from the swevent and overflow interface · a8b0ca17
      Peter Zijlstra committed
      The nmi parameter indicated if we could do wakeups from the current
      context, if not, we would set some state and self-IPI and let the
      resulting interrupt do the wakeup.
      
      For the various event classes:
      
        - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
          the PMI-tail (ARM etc.)
        - tracepoint: nmi=0; since tracepoint could be from NMI context.
        - software: nmi=[0,1]; some, like the schedule thing cannot
          perform wakeups, and hence need 0.
      
      As one can see, there is very little nmi=1 usage, and the down-side of
      not using it is that on some platforms some software events can have a
      jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).
      
      The up-side however is that we can remove the nmi parameter and save a
      bunch of conditionals in fast paths.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michael Cree <mcree@orcon.net.nz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. 24 May, 2011 1 commit
  14. 25 Apr, 2011 1 commit
  15. 28 Oct, 2010 2 commits
  16. 01 May, 2010 1 commit
  17. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  18. 26 Mar, 2010 1 commit
    • x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e
      Peter Zijlstra committed
      Support for the PMU's BTS features has been upstreamed in
      v2.6.32, but we still have the old and disabled ptrace-BTS,
      as Linus noticed it not so long ago.
      
      It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
      regard for other uses (perf) and doesn't provide the flexibility
      needed for perf either.
      
      Its users are ptrace-block-step and ptrace-bts. ptrace-bts was never
      used, and ptrace-block-step can be implemented using a much simpler
      approach.
      
      So axe all 3000 lines of it. That includes the *locked_memory*()
      APIs in mm/mlock.c as well.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <20100325135413.938004390@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  19. 20 Feb, 2010 1 commit
    • hw-breakpoint: Keep track of dr7 local enable bits · 326264a0
      Frederic Weisbecker committed
      When the user enables breakpoints through dr7, he can choose
      between "local" or "global" enable bits but given how linux is
      implemented, both have the same effect.
      
      That said we don't keep track of how the user enabled the breakpoints
      so when the user requests the dr7 value, we only translate the
      "enabled" status using the global enabled bits. It means that if
      the user enabled a breakpoint using the local enabled bit, reading
      back dr7 will set the global bit and clear the local one.
      
      Apps like Wine expect a full dr7 POKEUSER/PEEKUSER match for emulated
      software that implements old reverse engineering protection schemes.
      
      We fix that by keeping track of the whole dr7 value given by the user
      in the thread structure to drop this bug. We'll think about
      something more proper later.
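
      The mismatch and the fix can be illustrated with a small sketch. The
      bit positions follow the x86 DR7 layout (local enable for breakpoint n
      at bit 2n, global enable at bit 2n+1); everything else, including the
      function names and the thread_sketch struct, is hypothetical and
      ignores the other DR7 fields.

```c
#include <assert.h>
#include <stdint.h>

/* DR7 enable bits for breakpoint n (0..3): local at 2n, global at 2n+1. */
static uint32_t dr7_local_bit(unsigned n)
{
    assert(n < 4);
    return 1u << (2 * n);
}

static uint32_t dr7_global_bit(unsigned n)
{
    assert(n < 4);
    return 1u << (2 * n + 1);
}

/*
 * Behaviour described as buggy: rebuild the value from the "enabled"
 * status alone, always reporting the global bit.
 */
static uint32_t dr7_rebuilt_sketch(uint32_t user_dr7)
{
    uint32_t out = 0;

    for (unsigned n = 0; n < 4; n++)
        if (user_dr7 & (dr7_local_bit(n) | dr7_global_bit(n)))
            out |= dr7_global_bit(n);
    return out;
}

/* Behaviour after the fix: remember exactly what the user wrote. */
struct thread_sketch { uint32_t ptrace_dr7; };

static void poke_dr7_sketch(struct thread_sketch *t, uint32_t value)
{
    t->ptrace_dr7 = value;
}

static uint32_t peek_dr7_sketch(const struct thread_sketch *t)
{
    return t->ptrace_dr7;
}
```

      Storing the raw value trades a little per-thread state for an exact
      POKEUSER/PEEKUSER round trip, which is what the affected applications
      check for.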
      
      This fixes a 2.6.32 - 2.6.33-x ptrace regression.
      Reported-and-tested-by: Michael Stefaniuc <mstefani@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Maneesh Soni <maneesh@linux.vnet.ibm.com>
      Cc: Alexandre Julliard <julliard@winehq.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
  20. 12 Feb, 2010 1 commit
    • x86, ptrace: regset extensions to support xstate · 5b3efd50
      Suresh Siddha committed
      Add the xstate regset support which helps extend the kernel ptrace and the
      core-dump interfaces to support AVX state etc.
      
      This regset interface is designed to support all the future state that gets
      supported using xsave/xrstor infrastructure.
      
      Looking at the memory layout saved by "xsave", one can't say which state
      is represented in the memory layout. This is because if a particular state is
      in init state, in the xsave hdr it can be represented by bit '0'. And hence
      we can't really say by the xsave header whether a state is in init state or
      the state is not saved in the memory layout.
      
      And hence the xsave memory layout available through this regset
      interface uses SW usable bytes [464..511] to convey what state is represented
      in the memory layout.
      
      First 8 bytes of the sw_usable_bytes[464..471] will be set to OS enabled xstate
      mask (which is same as the 64bit mask returned by the xgetbv's xCR0).
      
      The note NT_X86_XSTATE represents the extended state information in the
      core file, using the above mentioned memory layout.
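
      For a consumer of this layout, reading the mask could look roughly like
      the sketch below. The offset 464 and the 8-byte width come from the
      description above; the function names and the 512-byte buffer used in
      the usage note are assumptions, and the value is read in host byte
      order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Start of the software-usable bytes in the legacy area (from the text). */
#define SW_USABLE_OFFSET_SKETCH 464

/* Store the enabled-state mask into the first 8 software-usable bytes. */
static void store_xstate_mask_sketch(unsigned char *area, uint64_t mask)
{
    assert(area != NULL);
    memcpy(area + SW_USABLE_OFFSET_SKETCH, &mask, sizeof(mask));
}

/* Read the mask back; memcpy avoids any alignment assumptions. */
static uint64_t load_xstate_mask_sketch(const unsigned char *area)
{
    uint64_t mask;

    assert(area != NULL);
    memcpy(&mask, area + SW_USABLE_OFFSET_SKETCH, sizeof(mask));
    return mask;
}
```

      A debugger or core-file reader would check individual bits of the
      returned mask to decide which state components are actually present in
      the rest of the buffer.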
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100211195614.802495327@sbs-t61.sc.intel.com>
      Signed-off-by: Hongjiu Lu <hjl.tools@gmail.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  21. 05 Feb, 2010 1 commit
  22. 13 Jan, 2010 1 commit
    • x86/ptrace: Remove unused regs_get_argument_nth API · aa5add93
      Masami Hiramatsu committed
      Because of dropping function argument syntax from kprobe-tracer,
      we don't need this API anymore.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linuxppc-dev@ozlabs.org
      LKML-Reference: <20100105224656.19431.92588.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  23. 17 Dec, 2009 1 commit
  24. 16 Dec, 2009 2 commits
  25. 09 Dec, 2009 1 commit
    • hw-breakpoints: Modify breakpoints without unregistering them · 44234adc
      Frederic Weisbecker committed
      Currently, when ptrace needs to modify a breakpoint, like disabling
      it, changing its address, type or len, it calls
      modify_user_hw_breakpoint(). This latter will perform the heavy and
      racy task of unregistering the old breakpoint and registering a new
      one.
      
      This is racy as someone else might steal the reserved breakpoint
      slot under us, which is undesired as the breakpoint is only
      supposed to be modified, sometimes in the middle of a debugging
      workflow. We don't want our slot to be stolen in the middle.
      
      So instead of unregistering/registering the breakpoint, just
      disable it while we modify its breakpoint fields and re-enable it
      after if necessary.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      LKML-Reference: <1260347148-5519-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  26. 06 Dec, 2009 2 commits
  27. 02 Dec, 2009 1 commit
  28. 27 Nov, 2009 1 commit
  29. 26 Nov, 2009 1 commit
  30. 08 Nov, 2009 1 commit
    • hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events · 24f1e32c
      Frederic Weisbecker committed
      This patch rebases the implementation of the breakpoints API on top of
      perf events instances.
      
      Each breakpoint is now a perf event that handles the
      register scheduling, thread/cpu attachment, etc.
      
      The new layering is now made as follows:
      
             ptrace       kgdb      ftrace   perf syscall
                \          |          /         /
                 \         |         /         /
                                              /
                  Core breakpoint API        /
                                            /
                           |               /
                           |              /
      
                    Breakpoints perf events
      
                           |
                           |
      
                     Breakpoints PMU ---- Debug Register constraints handling
                                          (Part of core breakpoint API)
                           |
                           |
      
                   Hardware debug registers
      
      Reasons of this rewrite:
      
      - Use the centralized/optimized pmu registers scheduling,
        implying an easier arch integration
      - More powerful register handling: perf attributes (pinned/flexible
        events, exclusive/non-exclusive, tunable period, etc...)
      
      Impact:
      
      - New perf ABI: the hardware breakpoints counters
      - Ptrace breakpoints setting remains tricky and still needs some per
        thread breakpoints references.
      
      Todo (in the order):
      
      - Support breakpoints perf counter events for perf tools (ie: implement
        perf_bpcounter_event())
      - Support from perf tools
      
      Changes in v2:
      
      - Follow the perf "event " rename
      - The ptrace regression has been fixed (ptrace breakpoint perf events
        weren't released when a task ended)
      - Drop the struct hw_breakpoint and store generic fields in
        perf_event_attr.
      - Separate core and arch specific headers, drop
        asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h
      - Use new generic len/type for breakpoint
      - Handle the off case: when the breakpoints API is not supported by an arch
      
      Changes in v3:
      
      - Fix broken CONFIG_KVM: we need to propagate the breakpoint API
        changes to KVM when we exit the guest and restore the bp registers
        to the host.
      
      Changes in v4:
      
      - Drop the hw_breakpoint_restore() stub as it is only used by KVM
      - EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a
        module
      - Restore the breakpoints unconditionally on KVM guest exit:
        TIF_DEBUG_THREAD no longer covers every case of running
        breakpoints, and vcpu->arch.switch_db_regs might not always be
        set when the guest used debug registers.
        (Waiting for a reliable optimization)
      
      Changes in v5:
      
      - Split the move of asm-generic/hw-breakpoint.h to
        linux/hw_breakpoint.h into a separate patch
      - Optimize the breakpoint restoring while switching from KVM guest
        to host. We only want to restore the state if we have active
        breakpoints on the host; otherwise we don't care about messed-up
        address registers.
      - Add asm/hw_breakpoint.h to Kbuild
      - Fix bad breakpoint type in trace_selftest.c
      
      Changes in v6:
      
      - Fix wrong header inclusion in trace.h (triggered a build
        error with CONFIG_FTRACE_SELFTEST)
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jan Kiszka <jan.kiszka@web.de>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      24f1e32c
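Note on the "New perf ABI" item in the commit above: a minimal user-space sketch of how a hardware breakpoint can be described as a perf event. It assumes the field and constant names found in current `linux/perf_event.h` and `linux/hw_breakpoint.h` (`PERF_TYPE_BREAKPOINT`, `bp_type`, `bp_addr`, `bp_len`); the ABI as first merged with this commit may have differed. The helper names `make_write_watchpoint()` and `open_watchpoint()` are hypothetical and are not part of the patch.

```c
#include <linux/hw_breakpoint.h> /* HW_BREAKPOINT_W, HW_BREAKPOINT_LEN_4 */
#include <linux/perf_event.h>    /* struct perf_event_attr, PERF_TYPE_BREAKPOINT */
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: describe a 4-byte write watchpoint on addr. */
struct perf_event_attr make_write_watchpoint(const void *addr)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_BREAKPOINT;   /* a breakpoint, not a counter */
    attr.size = sizeof(attr);
    attr.bp_type = HW_BREAKPOINT_W;     /* trigger on writes */
    attr.bp_addr = (uintptr_t)addr;     /* address to watch */
    attr.bp_len = HW_BREAKPOINT_LEN_4;  /* watch 4 bytes */
    attr.sample_period = 1;             /* report every hit */
    attr.exclude_kernel = 1;            /* user-space accesses only */
    attr.exclude_hv = 1;
    return attr;
}

/*
 * Hypothetical helper: glibc has no perf_event_open() wrapper, so the
 * raw syscall is used. Returns a file descriptor, or -1 with errno set
 * (for example when perf events are restricted on the system).
 */
long open_watchpoint(struct perf_event_attr *attr)
{
    /* pid = 0, cpu = -1: the calling thread on any CPU; no group, no flags. */
    return syscall(SYS_perf_event_open, attr, 0, -1, -1, 0UL);
}
```

Reading the returned descriptor with read(2) yields the hit count as a 64-bit value. Whether the open succeeds depends on the system's perf permissions and on a debug register being free.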
  31. 23 Sep, 2009 2 commits
  32. 11 Sep, 2009 1 commit
    • M
      x86/ptrace: Fix regs_get_argument_nth() to add correct offset · ad5cafcd
      Masami Hiramatsu authored
      Fix regs_get_argument_nth() to add the correct offset in bytes. Because
      offsetof() returns an offset in bytes, the offset should be added
      to a char * instead of an unsigned long *.
      
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20090910235306.22412.31613.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      ad5cafcd
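Note on the offset fix in the commit above: a small stand-alone sketch of why a byte offset has to be applied to a `char *`. The struct `fake_regs` and both function names are hypothetical stand-ins, not the kernel's `pt_regs` or the patched code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs. */
struct fake_regs {
    unsigned long bx, cx, dx;
};

/*
 * Correct form: cast to char * first, so that adding `offset`
 * advances by exactly `offset` bytes.
 */
unsigned long reg_at_byte_offset(const struct fake_regs *regs, size_t offset)
{
    return *(const unsigned long *)((const char *)regs + offset);
}

/*
 * What the buggy form computes: arithmetic on an unsigned long *
 * scales the offset by sizeof(unsigned long). Only the address is
 * calculated here (as an integer); it is never dereferenced.
 */
uintptr_t wrongly_scaled_address(const struct fake_regs *regs, size_t offset)
{
    return (uintptr_t)regs + offset * sizeof(unsigned long);
}
```

With `offsetof(struct fake_regs, cx)` as the offset, the first function reads `cx`, while the second address lands `sizeof(unsigned long)` times too far from the start of the structure.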
  33. 27 Aug, 2009 1 commit
    • M
      x86: Add pt_regs register and stack access APIs · b1cf540f
      Masami Hiramatsu authored
      Add the following APIs for accessing registers and stack entries from
      pt_regs.
      These APIs are required by the kprobes-based event tracer on ftrace.
      Some other debugging tools might be able to use them too.
      
      - regs_query_register_offset(const char *name)
         Query the offset of "name" register.
      
      - regs_query_register_name(unsigned int offset)
         Query the name of register by its offset.
      
      - regs_get_register(struct pt_regs *regs, unsigned int offset)
         Get the value of a register by its offset.
      
      - regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
         Check whether the address is in the kernel stack.
      
      - regs_get_kernel_stack_nth(struct pt_regs *reg, unsigned int nth)
         Get Nth entry of the kernel stack. (N >= 0)
      
      - regs_get_argument_nth(struct pt_regs *reg, unsigned int nth)
         Get Nth argument at function call. (N >= 0)
      
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      LKML-Reference: <20090813203444.31965.26374.stgit@localhost.localdomain>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      b1cf540f
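Note on the register access APIs listed in the commit above: a user-space sketch of the name/offset lookup idea, assuming a table that maps register names to `offsetof()` values. The struct `fake_regs`, the table, the `sketch_` function names, and the error values (-1, NULL, 0) are all hypothetical; the kernel functions operate on the real `struct pt_regs` and may report errors differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct pt_regs. */
struct fake_regs {
    unsigned long ax, bx, cx;
};

struct reg_offset_entry {
    const char *name;
    unsigned int offset;
};

/* One entry per register: its name and byte offset in the struct. */
static const struct reg_offset_entry reg_table[] = {
    { "ax", offsetof(struct fake_regs, ax) },
    { "bx", offsetof(struct fake_regs, bx) },
    { "cx", offsetof(struct fake_regs, cx) },
};

#define REG_TABLE_LEN (sizeof(reg_table) / sizeof(reg_table[0]))

/* Name -> byte offset, or -1 if the name is unknown. */
int sketch_query_register_offset(const char *name)
{
    for (size_t i = 0; i < REG_TABLE_LEN; i++)
        if (strcmp(reg_table[i].name, name) == 0)
            return (int)reg_table[i].offset;
    return -1;
}

/* Byte offset -> name, or NULL if no register starts at that offset. */
const char *sketch_query_register_name(unsigned int offset)
{
    for (size_t i = 0; i < REG_TABLE_LEN; i++)
        if (reg_table[i].offset == offset)
            return reg_table[i].name;
    return NULL;
}

/* Value of the register at a byte offset; 0 for an invalid offset. */
unsigned long sketch_get_register(const struct fake_regs *regs,
                                  unsigned int offset)
{
    if (offset % sizeof(unsigned long) != 0 ||
        offset + sizeof(unsigned long) > sizeof(*regs))
        return 0;
    return *(const unsigned long *)((const char *)regs + offset);
}
```

A tracer can resolve a register name to an offset once, when a probe is defined, and then fetch the value by offset on every hit without repeating the string comparison.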