1. 29 Aug, 2012 5 commits
    • uprobes: Change uprobe_mmap() to ignore the errors but check fatal_signal_pending() · 5e5be71a
      Oleg Nesterov committed
      Once install_breakpoint() fails, uprobe_mmap() "ignores" all other
      uprobes and returns the error.
      
      It was never really necessary to stop after the first error, and
      in fact it was always wrong, at least in the -ENOTSUPP case.
      
      Change uprobe_mmap() to ignore the errors and always return 0.
      This is not what we want in the long term, but until we teach
      the callers to handle the failure it would be better to remove
      the pointless complications. And this doesn't look too bad, the
      only "reasonable" error is ENOMEM but in this case the caller
      should be oom-killed in the likely case or the system has more
      serious problems.
      
      However it makes sense to stop if fatal_signal_pending() == T.
      In particular this helps if the task was oom-killed.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
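The control flow that 5e5be71a describes for uprobe_mmap() can be sketched in plain userspace C. This is only an illustration under stated assumptions: `toy_probe`, `toy_install()`, `toy_uprobe_mmap()` and the `toy_fatal_pending` flag are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one probe that should be installed in a mapping. */
struct toy_probe {
    int install_result; /* value the pretend install step returns */
    bool installed;     /* set once the install step succeeded */
};

/* Stand-in for fatal_signal_pending(current); set by the caller of the sketch. */
static bool toy_fatal_pending;

/* Pretend install step: succeeds or fails as configured in the probe. */
static int toy_install(struct toy_probe *p)
{
    if (p->install_result == 0)
        p->installed = true;
    return p->install_result;
}

/*
 * Try every probe and ignore individual failures. Stop early only when a
 * fatal signal is pending. The caller always sees success (0).
 */
static int toy_uprobe_mmap(struct toy_probe *probes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (toy_fatal_pending)
            break;
        (void)toy_install(&probes[i]); /* error deliberately dropped */
    }
    return 0;
}
```

With three probes where the first fails with -ENOMEM, the other two are still installed and the function returns 0. With the flag set, nothing is installed and the return value is still 0.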
    • uprobes: Kill dup_mmap()->uprobe_mmap(), simplify uprobe_mmap/munmap · f1a45d02
      Oleg Nesterov committed
      1. Kill dup_mmap()->uprobe_mmap(), it was only needed to calculate
         new_mm->uprobes_state.count removed by the previous patch.
      
         If the forking process has a pending uprobe (int3) in vma, it will
         be copied by copy_page_range(), note that it checks vma->anon_vma
         so "Don't copy ptes" is not possible after install_breakpoint()
         which does anon_vma_prepare().
      
      2. Remove is_swbp_at_addr() and "int count" in uprobe_mmap(). Again,
         this was needed for uprobes_state.count.
      
         As a side effect this fixes the bug pointed out by Srikar,
         this code lacked the necessary put_uprobe().
      
      3. uprobe_munmap() becomes a nop after the previous patch. Remove the
         meaningless code but do not remove the helper, we will need it.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Kill uprobes_state->count · 647c42df
      Oleg Nesterov committed
      uprobes_state->count is only needed to avoid the slow path in
      uprobe_pre_sstep_notifier(). It is also checked in uprobe_munmap(),
      but ironically its only goal there is to decrement this counter.
      However, it is very broken. Just some examples:
      
      - uprobe_mmap() can race with uprobe_unregister() and wrongly
        increment the counter if it hits the non-uprobe "int3". Note
        that install_breakpoint() checks ->consumers first and returns
        -EEXIST if it is NULL.
      
        "atomic_sub() if error" in uprobe_mmap() looks obviously wrong
        too.
      
      - uprobe_munmap() can race with uprobe_register() and wrongly
        decrement the counter for the same reason.
      
      - Suppose an application tries to increase the mmapped area via
        sys_mremap(). vma_adjust() does uprobe_munmap(whole_vma) first,
        this can nullify the counter temporarily and race with another
        thread which can hit the bp, the application will be killed by
        SIGTRAP.
      
      - Suppose an application mmaps 2 consecutive areas in the same file
        and one (or both) of these areas has uprobes. In the likely case
        mmap_region()->vma_merge() succeeds. Like above, this leads to
        uprobe_munmap/uprobe_mmap from vma_merge()->vma_adjust() but then
        mmap_region() does another uprobe_mmap(resulting_vma) and doubles
        the counter.
      
      This patch only removes this counter and fixes the compile errors;
      then we will try to clean up the changed code and add something else
      instead.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
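The last example in 647c42df (two consecutive mappings merged into one) can be replayed with a toy counter. This is a simplified userspace model, not kernel code: `toy_vma`, `toy_probe_addrs`, `toy_count` and the helper names are all hypothetical, and the call sequence simply follows the order the commit message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a mapping covering [start, end). */
struct toy_vma { unsigned long start, end; };

/* Assumed for illustration: a single probe at one address in the file. */
static const unsigned long toy_probe_addrs[] = { 0x1800 };

/* Stand-in for the per-mm counter that the patch removes. */
static int toy_count;

/* Number of probes that fall inside the given range. */
static int toy_probes_in(struct toy_vma v)
{
    int n = 0;
    size_t total = sizeof(toy_probe_addrs) / sizeof(toy_probe_addrs[0]);

    for (size_t i = 0; i < total; i++)
        if (toy_probe_addrs[i] >= v.start && toy_probe_addrs[i] < v.end)
            n++;
    return n;
}

static void toy_uprobe_mmap(struct toy_vma v)   { toy_count += toy_probes_in(v); }
static void toy_uprobe_munmap(struct toy_vma v) { toy_count -= toy_probes_in(v); }

/* Replays the merge scenario and returns the final counter value. */
static int toy_merge_scenario(void)
{
    struct toy_vma first  = { 0x1000, 0x2000 };
    struct toy_vma merged = { 0x1000, 0x3000 };

    toy_count = 0;
    toy_uprobe_mmap(first);   /* first mmap: the one probe is counted */
    toy_uprobe_munmap(first); /* adjust step drops the old range: count is briefly 0 */
    toy_uprobe_mmap(merged);  /* adjust step re-adds the merged range */
    toy_uprobe_mmap(merged);  /* the mapping path then counts the merged range again */
    return toy_count;
}
```

In this model the scenario ends with a count of 2 although only one probe exists, and the count passes through 0 in between, which is the window the sys_mremap() example refers to.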
    • uprobes: Remove check for uprobe variable in handle_swbp() · 8bd87445
      Sebastian Andrzej Siewior committed
      By the time we get here (after we pass cleanup_ret), uprobe is always
      set. If it is NULL, we leave very early in the code.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    • uprobes: Remove redundant lock_page/unlock_page · 61e1d394
      Srikar Dronamraju committed
      Since read_opcode() reads from the referenced page and modifies
      neither the page contents nor the page attributes, there is no need
      to lock the page.
      Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
  2. 30 Jul, 2012 12 commits
  3. 16 Jun, 2012 15 commits
  4. 08 Jun, 2012 1 commit
  5. 06 Jun, 2012 7 commits
    • uprobes: Kill uprobes_srcu/uprobe_srcu_id · 778b032d
      Oleg Nesterov committed
      Kill the no longer needed uprobes_srcu/uprobe_srcu_id code.
      
      It doesn't really work anyway. synchronize_srcu() can only
      synchronize with the code "inside" the
      srcu_read_lock/srcu_read_unlock section, while
      uprobe_pre_sstep_notifier() does srcu_read_lock() _after_ we
      already hit the breakpoint.
      
      I guess this probably works "in practice". synchronize_srcu() is
      slow and it implies synchronize_sched(), and the probed task
      enters the non-preemptible section at the start of the exception
      handler. Still, this is not right at least in theory, and
      task->uprobe_srcu_id bloats task_struct.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529193008.GG8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Teach handle_swbp() to rely on "is_swbp" rather than uprobes_srcu · 56bb4cf6
      Oleg Nesterov committed
      Currently handle_swbp() assumes that it can't race with
      unregister, so it roughly does:
      
      	if (find_uprobe(vaddr))
      		process_uprobe();
      	else
      		send_sig(SIGTRAP);
      
      This relies on the not-really-working uprobes_srcu code we are
      going to remove, see the next patch.
      
      With this patch we rely on the result of
      is_swbp_at_addr(bp_vaddr) if find_uprobe() fails.
      
      If is_swbp == 1, then we hit the normal int3, we should send
      SIGTRAP.
      
      If is_swbp == 0, we raced with uprobe_unregister(), we simply
      restart this insn again.
      
      The "difficult" case is is_swbp == -EFAULT, when we can't read
      this memory. In this case I think we should restart too, and
      this is more correct compared to the current code which sends
      SIGTRAP.
      
      Ignoring ENOMEM/etc from get_user_pages(), this can only happen
      if another thread unmaps this memory before find_active_uprobe()
      takes mmap_sem. It would be better to pretend it was unmapped
      before this insn was executed, restart, and get SIGSEGV.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192947.GF8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
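The three is_swbp cases listed in 56bb4cf6 reduce to a small decision function. The sketch below is a userspace illustration only: the enum `toy_swbp_action` and the function `toy_decide_swbp()` are hypothetical names, and the real handle_swbp() does considerably more than pick an action.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome labels for the breakpoint handler. */
enum toy_swbp_action {
    TOY_RUN_HANDLERS, /* a uprobe was found: process it */
    TOY_SEND_SIGTRAP, /* ordinary int3 that is not a uprobe */
    TOY_RESTART_INSN  /* re-execute the instruction at the trap address */
};

/*
 * uprobe_found: whether the lookup returned a uprobe.
 * is_swbp:      1 if a breakpoint insn is at the address, 0 if not,
 *               negative (e.g. -EFAULT) if the memory could not be read.
 */
static enum toy_swbp_action toy_decide_swbp(bool uprobe_found, int is_swbp)
{
    if (uprobe_found)
        return TOY_RUN_HANDLERS;
    if (is_swbp > 0)
        return TOY_SEND_SIGTRAP;
    /*
     * is_swbp == 0: raced with unregister, the breakpoint is already gone.
     * is_swbp  < 0: memory unreadable; restarting lets the task fault normally.
     */
    return TOY_RESTART_INSN;
}
```

Folding the 0 and negative cases into one branch mirrors the commit message's argument that restarting is the better choice for -EFAULT as well.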
    • uprobes: Change register_for_each_vma() to take mm->mmap_sem for writing · 77fc4af1
      Oleg Nesterov committed
      Change register_for_each_vma() to take mm->mmap_sem for writing.
      This is a bit unfortunate but hopefully not too bad, this is the
      slow path anyway.
      
      This is needed to ensure that find_active_uprobe() can not race
      with uprobe_register() which adds the new bp at the same
      bp_vaddr, after find_uprobe() fails and before
      is_swbp_at_addr_fast() checks the memory.
      
      IOW, this is needed to ensure that if find_active_uprobe()
      returns NULL but is_swbp == true, we can safely assume that it
      was the "normal" int3 and we should send SIGTRAP.
      
      There is another reason for this change. We are going to replace
      uprobes_state->count with MMF_ flags set by register/unregister
      and cleared by find_active_uprobe(), and set/clear shouldn't
      race with each other.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192928.GE8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Teach find_active_uprobe() to provide the "is_swbp" info · d790d346
      Oleg Nesterov committed
      A separate patch to simplify the review, and for the
      documentation.
      
      The patch adds another "int *is_swbp" argument to
      find_active_uprobe(), so far its only caller doesn't use this
      info.
      
      With this patch find_active_uprobe() additionally does:
      
      	- if find_vma() + ->vm_start check fails, *is_swbp = -EFAULT
      
      	- otherwise, if valid_vma() + find_uprobe() fails, it holds
      	  the result of is_swbp_at_addr(), can be negative too. The
      	  latter is only possible if we raced with another thread
      	  which did munmap/etc after we hit this bp.
      
      IOW. If find_active_uprobe(&is_swbp) returns NULL, the caller
      can look at is_swbp to figure out whether the current insn is bp
      or not, or detect the race with another thread if it is
      negative.
      
      Note: I think that performance-wise this change is fine. This
      adds is_swbp_at_addr(), but only if we raced with
      uprobe_unregister() or if we hit the "normal" int3 but this mm
      has uprobes as well. And even in this case the slow
      read_opcode() path is very unlikely, this insn recently
      triggered do_int3(), __copy_from_user_inatomic() shouldn't fail
      in the likely case.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192914.GD8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
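The out-parameter contract that d790d346 gives find_active_uprobe() can be modelled in userspace. Everything here is an assumption made for illustration: `toy_mm` collapses the address space to one mapped range, one probed address and one plain breakpoint address, and `toy_find_active()` is a hypothetical name. The model also leaves out the case where the opcode read itself fails and returns a negative value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified address space. */
struct toy_mm {
    unsigned long start, end; /* one mapped range [start, end) */
    unsigned long probe_addr; /* address with a registered probe, 0 if none */
    unsigned long int3_addr;  /* address holding a plain breakpoint, 0 if none */
};

struct toy_uprobe { unsigned long addr; };

static struct toy_uprobe toy_the_probe;

/*
 * Returns the probe at addr, or NULL. When NULL is returned, *is_swbp says
 * why: -EFAULT if addr is not mapped, 1 if a plain breakpoint sits there,
 * 0 if no breakpoint is there. *is_swbp is left untouched when a probe is found.
 */
static struct toy_uprobe *toy_find_active(const struct toy_mm *mm,
                                          unsigned long addr, int *is_swbp)
{
    if (addr < mm->start || addr >= mm->end) {
        *is_swbp = -EFAULT; /* models the failed vma lookup */
        return NULL;
    }
    if (mm->probe_addr != 0 && addr == mm->probe_addr) {
        toy_the_probe.addr = addr;
        return &toy_the_probe;
    }
    *is_swbp = (mm->int3_addr != 0 && addr == mm->int3_addr) ? 1 : 0;
    return NULL;
}
```

A caller that gets NULL back can then tell an ordinary int3 (1) from a vanished probe (0) or an unmapped address (negative) without a second lookup.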
    • uprobes: Introduce find_active_uprobe() helper · 3a9ea052
      Oleg Nesterov committed
      No functional changes. Move the "find uprobe" code from
      handle_swbp() to the new helper, find_active_uprobe().
      
      Note: with or without this change, the find-active-uprobe logic
      is not exactly right. We can race with another thread which
      unmaps the memory with the valid uprobe before we take
      mm->mmap_sem. We can't find this uprobe simply because
      find_vma() fails. In this case we wrongly assume that this trap
      was not caused by uprobe and send the erroneous SIGTRAP. See the
      next changes.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192857.GC8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Change read_opcode() to use FOLL_FORCE · a3d7bb47
      Oleg Nesterov committed
      set_orig_insn()->read_opcode() should not fail if the probed
      task did mprotect() after uprobe_register(), change it to use
      FOLL_FORCE. Without FOLL_WRITE this doesn't have any "side"
      effect but allows reading the !VM_READ memory.
      
      There is another reason for this change, we are going to use
      is_swbp_at_addr() from handle_swbp() which can race with another
      thread doing mprotect().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192759.GB8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Optimize is_swbp_at_addr() for current->mm · c00b2750
      Oleg Nesterov committed
      Change is_swbp_at_addr() to try to avoid the costly
      read_opcode() if mm == current->mm, __copy_from_user_inatomic()
      should succeed in the likely case.
      
      Currently this optimization is not important, but we are going
      to add more is_swbp_at_addr(current->mm) callers.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192744.GA8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
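The fast-path idea in c00b2750 (try a cheap read when the target mm is the current one, fall back to the costly read otherwise) can be sketched with function pointers standing in for the two read paths. This is a userspace illustration under assumptions: `toy_is_swbp_at_addr()`, the `toy_read_fn` type and the three reader helpers are hypothetical, and 0xcc is used as the breakpoint opcode because that is the x86 int3 encoding.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_INT3 0xcc /* x86 int3 opcode */

/* Hypothetical reader: stores one opcode byte in *out, returns 0 or -errno. */
typedef int (*toy_read_fn)(unsigned long addr, unsigned char *out);

/* Counts how often the slow path was taken. */
static int toy_slow_calls;

/* Fast reader that succeeds and sees a breakpoint. */
static int toy_fast_ok(unsigned long addr, unsigned char *out)
{
    (void)addr;
    *out = TOY_INT3;
    return 0;
}

/* Fast reader that fails, as an in-atomic copy may. */
static int toy_fast_fault(unsigned long addr, unsigned char *out)
{
    (void)addr;
    (void)out;
    return -EFAULT;
}

/* Slow reader that succeeds and sees a nop (0x90 on x86). */
static int toy_slow_nop(unsigned long addr, unsigned char *out)
{
    (void)addr;
    toy_slow_calls++;
    *out = 0x90;
    return 0;
}

/* Returns 1 if a breakpoint is at addr, 0 if not, negative on read error. */
static int toy_is_swbp_at_addr(bool is_current_mm, unsigned long addr,
                               toy_read_fn fast, toy_read_fn slow)
{
    unsigned char opcode;
    int err;

    /* Cheap attempt first, but only for the caller's own address space. */
    if (is_current_mm && fast(addr, &opcode) == 0)
        return opcode == TOY_INT3;

    /* Otherwise, or if the cheap attempt failed, take the costly path. */
    err = slow(addr, &opcode);
    if (err)
        return err;
    return opcode == TOY_INT3;
}
```

When the fast reader succeeds, the slow reader is never called. It is only reached for a foreign mm or after the fast attempt fails.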