1. 19 3月, 2014 1 次提交
  2. 20 11月, 2013 2 次提交
    • O
      uprobes: Cleanup !CONFIG_UPROBES decls, unexport xol_area · c912dae6
      Oleg Nesterov 提交于
      1. Don't include asm/uprobes.h unconditionally, we only need
         it if CONFIG_UPROBES.
      
      2. Move the definition of "struct xol_area" into uprobes.c.
      
         Perhaps we should simply kill struct uprobes_state, it buys
         nothing.
      
      3. Kill the dummy definition of uprobe_get_swbp_addr(), nobody
         except handle_swbp() needs it.
      
      4. Purely cosmetic, but move the decl of uprobe_get_swbp_addr()
         up, close to other __weak helpers.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      c912dae6
    • O
      uprobes: Add uprobe_task->dup_xol_work/dup_xol_addr · 32473431
      Oleg Nesterov 提交于
      uprobe_task->vaddr is a bit strange. The generic code uses it only
      to pass the additional argument to arch_uprobe_pre_xol(), and since
      it is always equal to instruction_pointer() this looks even more
      strange.
      
      And both utask->vaddr and and utask->autask have the same scope,
      they only have the meaning when the task executes the probed insn
      out-of-line, so it is safe to reuse both in UTASK_RUNNING state.
      
      This all means that logically ->vaddr belongs to arch_uprobe_task
      and we should probably move it there, arch_uprobe_pre_xol() can
      record instruction_pointer() itself.
      
      OTOH, it is also used by uprobe_copy_process() and dup_xol_work()
      for another purpose, this doesn't look clean and doesn't allow to
      move this member into arch_uprobe_task.
      
      This patch adds the union with 2 anonymous structs into uprobe_task.
      
      The first struct is autask + vaddr, this way we "almost" move vaddr
      into autask.
      
      The second struct has 2 new members for uprobe_copy_process() paths:
      ->dup_xol_addr which can be used instead ->vaddr, and ->dup_xol_work
      which can be used to avoid kmalloc() and simplify the code.
      
      Note that this union will likely have another member(s), we need
      something like "private_data_for_handlers" so that the tracing
      handlers could use it to communicate with call_fetch() methods.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      32473431
  3. 07 11月, 2013 2 次提交
  4. 30 10月, 2013 2 次提交
    • O
      uprobes: Teach uprobe_copy_process() to handle CLONE_VFORK · 3ab67966
      Oleg Nesterov 提交于
      uprobe_copy_process() does nothing if the child shares ->mm with
      the forking process, but there is a special case: CLONE_VFORK.
      In this case it would be more correct to do dup_utask() but avoid
      dup_xol(). This is not that important, the child should not unwind
      its stack too much, this can corrupt the parent's stack, but at
      least we need this to allow to ret-probe __vfork() itself.
      
      Note: in theory, it would be better to check task_pt_regs(p)->sp
      instead of CLONE_VFORK, we need to dup_utask() if and only if the
      child can return from the function called by the parent. But this
      needs the arch-dependant helper, and I think that nobody actually
      does clone(same_stack, CLONE_VM).
      Reported-by: NMartin Cermak <mcermak@redhat.com>
      Reported-by: NDavid Smith <dsmith@redhat.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      3ab67966
    • R
      uprobes: Remove the wrong __weak attribute · c2d3f25d
      Ralf Baechle 提交于
      linux/uprobes.h declares arch_uprobe_skip_sstep() as a weak function.
      But as there is no definition of generic version so when trying to build
      uprobes for an architecture that doesn't yet have a arch_uprobe_skip_sstep()
      implementation, the vmlinux will try to call arch_uprobe_skip_sstep()
      somehwere in Stupidhistan leading to a system crash.  We rather want a
      proper link error so remove arch_uprobe_skip_sstep().
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      c2d3f25d
  5. 13 4月, 2013 3 次提交
  6. 04 4月, 2013 1 次提交
  7. 09 2月, 2013 4 次提交
    • O
      uprobes: Introduce uprobe_apply() · bdf8647c
      Oleg Nesterov 提交于
      Currently it is not possible to change the filtering constraints after
      uprobe_register(), so a consumer can not, say, start to trace a task/mm
      which was previously filtered out, or remove the no longer needed bp's.
      
      Introduce uprobe_apply() which simply does register_for_each_vma() again
      to consult uprobe_consumer->filter() and install/remove the breakpoints.
      The only complication is that register_for_each_vma() can no longer
      assume that uprobe->consumers should be consulter if is_register == T,
      so we change it to accept "struct uprobe_consumer *new" instead.
      
      Unlike uprobe_register(), uprobe_apply(true) doesn't do "unregister" if
      register_for_each_vma() fails, it is up to caller to handle the error.
      
      Note: we probably need to cleanup the current interface, it is strange
      that uprobe_apply/unregister need inode/offset. We should either change
      uprobe_register() to return "struct uprobe *", or add a private ->uprobe
      member in uprobe_consumer. And in the long term uprobe_apply() should
      take a single argument, uprobe or consumer, even "bool add" should go
      away.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      bdf8647c
    • O
      uprobes: Teach handler_chain() to filter out the probed task · da1816b1
      Oleg Nesterov 提交于
      Currrently the are 2 problems with pre-filtering:
      
      1. It is not possible to add/remove a task (mm) after uprobe_register()
      
      2. A forked child inherits all breakpoints and uprobe_consumer can not
         control this.
      
      This patch does the first step to improve the filtering. handler_chain()
      removes the breakpoints installed by this uprobe from current->mm if all
      handlers return UPROBE_HANDLER_REMOVE.
      
      Note that handler_chain() relies on ->register_rwsem to avoid the race
      with uprobe_register/unregister which can add/del a consumer, or even
      remove and then insert the new uprobe at the same address.
      
      Perhaps we will add uprobe_apply_mm(uprobe, mm, is_register) and teach
      copy_mm() to do filter(UPROBE_FILTER_FORK), but I think this change makes
      sense anyway.
      
      Note: instead of checking the retcode from uc->handler, we could add
      uc->filter(UPROBE_FILTER_BPHIT). But I think this is not optimal to
      call 2 hooks in a row. This buys nothing, and if handler/filter do
      something nontrivial they will probably do the same work twice.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      da1816b1
    • O
      uprobes: Reintroduce uprobe_consumer->filter() · 8a7f2fa0
      Oleg Nesterov 提交于
      Finally add uprobe_consumer->filter() and change consumer_filter()
      to actually call this method.
      
      Note that ->filter() accepts mm_struct, not task_struct. Because:
      
      	1. We do not have for_each_mm_user(mm, task).
      
      	2. Even if we implement for_each_mm_user(), ->filter() can
      	   use it itself.
      
      	3. It is not clear who will actually need this interface to
      	   do the "nontrivial" filtering.
      
      Another argument is "enum uprobe_filter_ctx", consumer->filter() can
      use it to figure out why/where it was called. For example, perhaps
      we can add UPROBE_FILTER_PRE_REGISTER used by build_map_info() to
      quickly "nack" the unwanted mm's. In this case consumer should know
      that it is called under ->i_mmap_mutex.
      
      See the previous discussion at http://marc.info/?t=135214229700002
      Perhaps we should pass more arguments, vma/vaddr?
      
      Note: this patch obviously can't help to filter out the child created
      by fork(), this will be addressed later.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      8a7f2fa0
    • O
      uprobes: Kill uprobe_consumer->filter() · fe20d71f
      Oleg Nesterov 提交于
      uprobe_consumer->filter() is pointless in its current form, kill it.
      
      We will add it back, but with the different signature/semantics. Perhaps
      we will even re-introduce the callsite in handler_chain(), but not to
      just skip uc->handler().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      fe20d71f
  8. 16 11月, 2012 1 次提交
    • O
      uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race · 32cdba1e
      Oleg Nesterov 提交于
      This was always racy, but 26872090
      "uprobes: Rework register_for_each_vma() to make it O(n)" should be
      blamed anyway, it made everything worse and I didn't notice.
      
      register/unregister call build_map_info() and then do install/remove
      breakpoint for every mm which mmaps inode/offset. This can obviously
      race with fork()->dup_mmap() in between and we can miss the child.
      
      uprobe_register() could be easily fixed but unregister is much worse,
      the new mm inherits "int3" from parent and there is no way to detect
      this if uprobe goes away.
      
      So this patch simply adds percpu_down_read/up_read around dup_mmap(),
      and percpu_down_write/up_write into register_for_each_vma().
      
      This adds 2 new hooks into dup_mmap() but we can kill uprobe_dup_mmap()
      and fold it into uprobe_end_dup_mmap().
      Reported-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      32cdba1e
  9. 04 11月, 2012 1 次提交
  10. 08 10月, 2012 1 次提交
  11. 30 9月, 2012 1 次提交
    • O
      uprobes: Kill UTASK_BP_HIT state · 1b08e907
      Oleg Nesterov 提交于
      Kill UTASK_BP_HIT state, it buys nothing but complicates the code.
      It is only used in uprobe_notify_resume() to decide who should be
      called, we can check utask->active_uprobe != NULL instead. And this
      allows us to simplify handle_swbp(), no need to clear utask->state.
      
      Likewise we could kill UTASK_SSTEP, but UTASK_BP_HIT is worse and
      imho should die. The problem is, it creates the special case when
      task->utask is NULL, we can't distinguish RUNNING and BP_HIT. With
      this patch utask == NULL always means RUNNING.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      1b08e907
  12. 15 9月, 2012 1 次提交
  13. 29 8月, 2012 4 次提交
    • O
      uprobes: Remove "verify" argument from set_orig_insn() · ded86e7c
      Oleg Nesterov 提交于
      Nobody does set_orig_insn(verify => false), and I think nobody will.
      Remove this argument. IIUC set_orig_insn(verify => false) was needed
      to single-step without xol area.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      ded86e7c
    • O
      uprobes: Fold uprobe_reset_state() into uprobe_dup_mmap() · 61559a81
      Oleg Nesterov 提交于
      Now that we have uprobe_dup_mmap() we can fold uprobe_reset_state()
      into the new hook and remove it. mmput()->uprobe_clear_state() can't
      be called before dup_mmap().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      61559a81
    • O
      uprobes: Introduce MMF_HAS_UPROBES · f8ac4ec9
      Oleg Nesterov 提交于
      Add the new MMF_HAS_UPROBES flag. It is set by install_breakpoint()
      and it is copied by dup_mmap(), uprobe_pre_sstep_notifier() checks
      it to avoid the slow path if the task was never probed. Perhaps it
      makes sense to check it in valid_vma(is_register => false) as well.
      
      This needs the new dup_mmap()->uprobe_dup_mmap() hook. We can't use
      uprobe_reset_state() or put MMF_HAS_UPROBES into MMF_INIT_MASK, we
      need oldmm->mmap_sem to avoid the race with uprobe_register() or
      mmap() from another thread.
      
      Currently we never clear this bit, it can be false-positive after
      uprobe_unregister() or uprobe_munmap() or if dup_mmap() hits the
      probed VM_DONTCOPY vma. But this is fine correctness-wise and has
      no effect unless the task hits the non-uprobe breakpoint.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      f8ac4ec9
    • O
      uprobes: Kill uprobes_state->count · 647c42df
      Oleg Nesterov 提交于
      uprobes_state->count is only needed to avoid the slow path in
      uprobe_pre_sstep_notifier(). It is also checked in uprobe_munmap()
      but ironically its only goal to decrement this counter. However,
      it is very broken. Just some examples:
      
      - uprobe_mmap() can race with uprobe_unregister() and wrongly
        increment the counter if it hits the non-uprobe "int3". Note
        that install_breakpoint() checks ->consumers first and returns
        -EEXIST if it is NULL.
      
        "atomic_sub() if error" in uprobe_mmap() looks obviously wrong
        too.
      
      - uprobe_munmap() can race with uprobe_register() and wrongly
        decrement the counter by the same reason.
      
      - Suppose an appication tries to increase the mmapped area via
        sys_mremap(). vma_adjust() does uprobe_munmap(whole_vma) first,
        this can nullify the counter temporarily and race with another
        thread which can hit the bp, the application will be killed by
        SIGTRAP.
      
      - Suppose an application mmaps 2 consecutive areas in the same file
        and one (or both) of these areas has uprobes. In the likely case
        mmap_region()->vma_merge() suceeds. Like above, this leads to
        uprobe_munmap/uprobe_mmap from vma_merge()->vma_adjust() but then
        mmap_region() does another uprobe_mmap(resulting_vma) and doubles
        the counter.
      
      This patch only removes this counter and fixes the compile errors,
      then we will try to cleanup the changed code and add something else
      instead.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      647c42df
  14. 14 4月, 2012 1 次提交
    • S
      uprobes/core: Decrement uprobe count before the pages are unmapped · cbc91f71
      Srikar Dronamraju 提交于
      Uprobes has a callback (uprobe_munmap()) in the unmap path to
      maintain the uprobes count.
      
      In the exit path this callback gets called in unlink_file_vma().
      However by the time unlink_file_vma() is called, the pages would
      have been unmapped (in unmap_vmas()) and the task->rss_stat counts
      accounted (in zap_pte_range()).
      
      If the exiting process has probepoints, uprobe_munmap() checks if
      the breakpoint instruction was around before decrementing the probe
      count.
      
      This results in a file backed page being reread by uprobe_munmap()
      and hence it does not find the breakpoint.
      
      This patch fixes this problem by moving the callback to
      unmap_single_vma(). Since unmap_single_vma() may not unmap the
      complete vma, add start and end parameters to uprobe_munmap().
      
      This bug became apparent courtesy of commit c3f0327f
      ("mm: add rss counters consistency check").
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120411103527.23245.9835.sendpatchset@srdronam.in.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cbc91f71
  15. 31 3月, 2012 2 次提交
    • S
      uprobes/core: Optimize probe hits with the help of a counter · 682968e0
      Srikar Dronamraju 提交于
      Maintain a per-mm counter: number of uprobes that are inserted
      on this process address space.
      
      This counter can be used at probe hit time to determine if we
      need a lookup in the uprobes rbtree. Everytime a probe gets
      inserted successfully, the probe count is incremented and
      everytime a probe gets removed, the probe count is decremented.
      
      The new uprobe_munmap hook ensures the count is correct on a
      unmap or remap of a region. We expect that once a
      uprobe_munmap() is called, the vma goes away.  So
      uprobe_unregister() finding a probe to unregister would either
      mean unmap event hasnt occurred yet or a mmap event on the same
      executable file occured after a unmap event.
      
      Additionally, uprobe_mmap hook now also gets called:
      
       a. on every executable vma that is COWed at fork.
       b. a vma of interest is newly mapped; breakpoint insertion also
          happens at the required address.
      
      On process creation, make sure the probes count in the child is
      set correctly.
      
      Special cases that are taken care include:
      
       a. mremap
       b. VM_DONTCOPY vmas on fork()
       c. insertion/removal races in the parent during fork().
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120330182646.10018.85805.sendpatchset@srdronam.in.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      682968e0
    • S
      uprobes/core: Allocate XOL slots for uprobes use · d4b3b638
      Srikar Dronamraju 提交于
      Uprobes executes the original instruction at a probed location
      out of line. For this, we allocate a page (per mm) upon the
      first uprobe hit, in the process user address space, divide it
      into slots that are used to store the actual instructions to be
      singlestepped. These slots are known as xol (execution out of
      line) slots.
      
      Care is taken to ensure that the allocation is in an unmapped
      area as close to the top of the user address space as possible,
      with appropriate permission settings to keep selinux like
      frameworks happy.
      
      Upon a uprobe hit, a free slot is acquired, and is released
      after the singlestep completes.
      
      Lots of improvements courtesy suggestions/inputs from Peter and
      Oleg.
      
      [ Folded a fix for build issue on powerpc fixed and reported by
        Stephen Rothwell. ]
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120330182631.10018.48175.sendpatchset@srdronam.in.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d4b3b638
  16. 14 3月, 2012 1 次提交
    • S
      uprobes/core: Handle breakpoint and singlestep exceptions · 0326f5a9
      Srikar Dronamraju 提交于
      Uprobes uses exception notifiers to get to know if a thread hit
      a breakpoint or a singlestep exception.
      
      When a thread hits a uprobe or is singlestepping post a uprobe
      hit, the uprobe exception notifier sets its TIF_UPROBE bit,
      which will then be checked on its return to userspace path
      (do_notify_resume() ->uprobe_notify_resume()), where the
      consumers handlers are run (in task context) based on the
      defined filters.
      
      Uprobe hits are thread specific and hence we need to maintain
      information about if a task hit a uprobe, what uprobe was hit,
      the slot where the original instruction was copied for xol so
      that it can be singlestepped with appropriate fixups.
      
      In some cases, special care is needed for instructions that are
      executed out of line (xol). These are architecture specific
      artefacts, such as handling RIP relative instructions on x86_64.
      
      Since the instruction at which the uprobe was inserted is
      executed out of line, architecture specific fixups are added so
      that the thread continues normal execution in the presence of a
      uprobe.
      
      Postpone the signals until we execute the probed insn.
      post_xol() path does a recalc_sigpending() before return to
      user-mode, this ensures the signal can't be lost.
      
      Uprobes relies on DIE_DEBUG notification to notify if a
      singlestep is complete.
      
      Adds x86 specific uprobe exception notifiers and appropriate
      hooks needed to determine a uprobe hit and subsequent post
      processing.
      
      Add requisite x86 fixups for xol for uprobes. Specific cases
      needing fixups include relative jumps (x86_64), calls, etc.
      
      Where possible, we check and skip singlestepping the
      breakpointed instructions. For now we skip single byte as well
      as few multibyte nop instructions. However this can be extended
      to other instructions too.
      
      Credits to Oleg Nesterov for suggestions/patches related to
      signal, breakpoint, singlestep handling code.
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120313180011.29771.89027.sendpatchset@srdronam.in.ibm.com
      [ Performed various cleanliness edits ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0326f5a9
  17. 13 3月, 2012 3 次提交
  18. 22 2月, 2012 3 次提交
    • I
      uprobes: Update copyright notices · 35aa621b
      Ingo Molnar 提交于
      Add Peter Zijlstra's copyright to the uprobes code, whose
      contributions to the uprobes code are not visible in the Git
      history, because they were backmerged.
      
      Also update existing copyright notices to the year 2012.
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/n/tip-vjqxst502pc1efz7ah8cyht4@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      35aa621b
    • S
      uprobes/core: Move insn to arch specific structure · 3ff54efd
      Srikar Dronamraju 提交于
      Few cleanups suggested by Ingo Molnar.
      
      - Rename struct uprobe_arch_info to struct arch_uprobe.
      - Move insn from struct uprobe to struct arch_uprobe.
      - Make arch specific uprobe functions to accept struct arch_uprobe
        instead of  struct uprobe.
      - Move struct uprobe to kernel/uprobes.c from include/linux/uprobes.h
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Stone <jistone@redhat.com>
      Link: http://lkml.kernel.org/r/20120222091602.15880.40249.sendpatchset@srdronam.in.ibm.com
      [ Made various small improvements ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3ff54efd
    • S
      uprobes/core: Remove uprobe_opcode_sz · 96379f60
      Srikar Dronamraju 提交于
      uprobe_opcode_sz refers to the smallest instruction size for
      that architecture. UPROBES_BKPT_INSN_SIZE refers to the size of
      the breakpoint instruction for that architecture.
      
      For now we are assuming that both uprobe_opcode_sz and
      UPROBES_BKPT_INSN_SIZE are the same for all archs and hence
      removing uprobe_opcode_sz in favour of UPROBES_BKPT_INSN_SIZE.
      
      However if we have to support architectures where the smallest
      instruction size is different from the size of breakpoint
      instruction, we may have to re-introduce uprobe_opcode_sz.
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Stone <jistone@redhat.com>
      Link: http://lkml.kernel.org/r/20120222091549.15880.67020.sendpatchset@srdronam.in.ibm.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      96379f60
  19. 17 2月, 2012 2 次提交
    • I
      uprobes/core: Clean up, refactor and improve the code · 7b2d81d4
      Ingo Molnar 提交于
      Make the uprobes code readable to me:
      
       - improve the Kconfig text so that a mere mortal gets some idea
         what CONFIG_UPROBES=y is really about
      
       - do trivial renames to standardize around the uprobes_*() namespace
      
       - clean up and simplify various code flow details
      
       - separate basic blocks of functionality
      
       - line break artifact and white space related removal
      
       - use standard local varible definition blocks
      
       - use vertical spacing to make things more readable
      
       - remove unnecessary volatile
      
       - restructure comment blocks to make them more uniform and
         more readable in general
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Link: http://lkml.kernel.org/n/tip-ewbwhb8o6navvllsauu7k07p@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      7b2d81d4
    • S
      uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints · 2b144498
      Srikar Dronamraju 提交于
      Add uprobes support to the core kernel, with x86 support.
      
      This commit adds the kernel facilities, the actual uprobes
      user-space ABI and perf probe support comes in later commits.
      
      General design:
      
      Uprobes are maintained in an rb-tree indexed by inode and offset
      (the offset here is from the start of the mapping). For a unique
      (inode, offset) tuple, there can be at most one uprobe in the
      rb-tree.
      
      Since the (inode, offset) tuple identifies a unique uprobe, more
      than one user may be interested in the same uprobe. This provides
      the ability to connect multiple 'consumers' to the same uprobe.
      
      Each consumer defines a handler and a filter (optional). The
      'handler' is run every time the uprobe is hit, if it matches the
      'filter' criteria.
      
      The first consumer of a uprobe causes the breakpoint to be
      inserted at the specified address and subsequent consumers are
      appended to this list.  On subsequent probes, the consumer gets
      appended to the existing list of consumers. The breakpoint is
      removed when the last consumer unregisters. For all other
      unregisterations, the consumer is removed from the list of
      consumers.
      
      Given a inode, we get a list of the mms that have mapped the
      inode. Do the actual registration if mm maps the page where a
      probe needs to be inserted/removed.
      
      We use a temporary list to walk through the vmas that map the
      inode.
      
      - The number of maps that map the inode, is not known before we
        walk the rmap and keeps changing.
      - extending vm_area_struct wasn't recommended, it's a
        size-critical data structure.
      - There can be more than one maps of the inode in the same mm.
      
      We add callbacks to the mmap methods to keep an eye on text vmas
      that are of interest to uprobes.  When a vma of interest is mapped,
      we insert the breakpoint at the right address.
      
      Uprobe works by replacing the instruction at the address defined
      by (inode, offset) with the arch specific breakpoint
      instruction. We save a copy of the original instruction at the
      uprobed address.
      
      This is needed for:
      
       a. executing the instruction out-of-line (xol).
       b. instruction analysis for any subsequent fixups.
       c. restoring the instruction back when the uprobe is unregistered.
      
      We insert or delete a breakpoint instruction, and this
      breakpoint instruction is assumed to be the smallest instruction
      available on the platform. For fixed size instruction platforms
      this is trivially true, for variable size instruction platforms
      the breakpoint instruction is typically the smallest (often a
      single byte).
      
      Writing the instruction is done by COWing the page and changing
      the instruction during the copy, this even though most platforms
      allow atomic writes of the breakpoint instruction. This also
      mirrors the behaviour of a ptrace() memory write to a PRIVATE
      file map.
      
      The core worker is derived from KSM's replace_page() logic.
      
      In essence, similar to KSM:
      
       a. allocate a new page and copy over contents of the page that
          has the uprobed vaddr
       b. modify the copy and insert the breakpoint at the required
          address
       c. switch the original page with the copy containing the
          breakpoint
       d. flush page tables.
      
      replace_page() is being replicated here because of some minor
      changes in the type of pages and also because Hugh Dickins had
      plans to improve replace_page() for KSM specific work.
      
      Instruction analysis on x86 is based on instruction decoder and
      determines if an instruction can be probed and determines the
      necessary fixups after singlestep.  Instruction analysis is done
      at probe insertion time so that we avoid having to repeat the
      same analysis every time a probe is hit.
      
      A lot of code here is due to the improvement/suggestions/inputs
      from Peter Zijlstra.
      
      Changelog:
      
      (v10):
       - Add code to clear REX.B prefix as suggested by Denys Vlasenko
         and Masami Hiramatsu.
      
      (v9):
       - Use insn_offset_modrm as suggested by Masami Hiramatsu.
      
      (v7):
      
       Handle comments from Peter Zijlstra:
      
       - Dont take reference to inode. (expect inode to uprobe_register to be sane).
       - Use PTR_ERR to set the return value.
       - No need to take reference to inode.
       - use PTR_ERR to return error value.
       - register and uprobe_unregister share code.
      
      (v5):
      
       - Modified del_consumer as per comments from Peter.
       - Drop reference to inode before dropping reference to uprobe.
       - Use i_size_read(inode) instead of inode->i_size.
       - Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called.
       - Includes errno.h as recommended by Stephen Rothwell to fix a build issue
         on sparc defconfig
       - Remove restrictions while unregistering.
       - Earlier code leaked inode references under some conditions while
         registering/unregistering.
       - Continue the vma-rmap walk even if the intermediate vma doesnt
         meet the requirements.
       - Validate the vma found by find_vma before inserting/removing the
         breakpoint
       - Call del_consumer under mutex_lock.
       - Use hash locks.
       - Handle mremap.
       - Introduce find_least_offset_node() instead of close match logic in
         find_uprobe
       - Uprobes no more depends on MM_OWNER; No reference to task_structs
         while inserting/removing a probe.
       - Uses read_mapping_page instead of grab_cache_page so that the pages
         have valid content.
       - pass NULL to get_user_pages for the task parameter.
       - call SetPageUptodate on the new page allocated in write_opcode.
       - fix leaking a reference to the new page under certain conditions.
       - Include Instruction Decoder if Uprobes gets defined.
       - Remove const attributes for instruction prefix arrays.
       - Uses mm_context to know if the application is 32 bit.
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Also-written-by: NJim Keniston <jkenisto@us.ibm.com>
      Reviewed-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linux-mm <linux-mm@kvack.org>
      Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com
      [ Made various small edits to the commit log ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2b144498