  1. 30 Jul, 2012 5 commits
  2. 18 Jun, 2012 4 commits
  3. 16 Jun, 2012 15 commits
  4. 08 Jun, 2012 1 commit
  5. 06 Jun, 2012 7 commits
    • uprobes: Kill uprobes_srcu/uprobe_srcu_id · 778b032d
      Committed by Oleg Nesterov
      Kill the no longer needed uprobes_srcu/uprobe_srcu_id code.
      
      It doesn't really work anyway. synchronize_srcu() can only
      synchronize with the code "inside" the
      srcu_read_lock/srcu_read_unlock section, while
      uprobe_pre_sstep_notifier() does srcu_read_lock() _after_ we
      already hit the breakpoint.
      
      I guess this probably works "in practice". synchronize_srcu() is
      slow and it implies synchronize_sched(), and the probed task
      enters the non-preemptible section at the start of the exception
      handler. Still, this is not right, at least in theory, and
      task->uprobe_srcu_id bloats task_struct.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529193008.GG8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Teach handle_swbp() to rely on "is_swbp" rather than uprobes_srcu · 56bb4cf6
      Committed by Oleg Nesterov
      Currently handle_swbp() assumes that it can't race with
      unregister, so it roughly does:
      
      	if (find_uprobe(vaddr))
      		process_uprobe();
      	else
      		send_sig(SIGTRAP);
      
      This relies on the not-really-working uprobes_srcu code we are
      going to remove, see the next patch.
      
      With this patch we rely on the result of
      is_swbp_at_addr(bp_vaddr) if find_uprobe() fails.
      
      If is_swbp == 1, then we hit the normal int3, we should send
      SIGTRAP.
      
      If is_swbp == 0, we raced with uprobe_unregister(), we simply
      restart this insn again.
      
      The "difficult" case is is_swbp == -EFAULT, when we can't read
      this memory. In this case I think we should restart too, and
      this is more correct compared to the current code which sends
      SIGTRAP.
      
      Ignoring ENOMEM/etc from get_user_pages(), this can only happen
      if another thread unmaps this memory before find_active_uprobe()
      takes mmap_sem. It would be better to pretend it was unmapped
      before this insn was executed, restart, and get SIGSEGV.
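      The three outcomes described above can be sketched as a small
      decision function. This is a user-space illustration only, not the
      kernel code: decide_swbp() and the ACT_* names are hypothetical, and
      the real handle_swbp() acts directly instead of returning a code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome codes, introduced only for this sketch. */
enum swbp_action {
	ACT_PROCESS_UPROBE,	/* a uprobe owns the trapping address */
	ACT_SEND_SIGTRAP,	/* an ordinary int3 that no uprobe owns */
	ACT_RESTART_INSN,	/* raced with unregister or unmap: re-execute */
};

/*
 * uprobe_found: nonzero if the uprobe lookup succeeded.
 * is_swbp:      1 if a breakpoint insn is still at the address,
 *               0 if it is gone, negative errno if unreadable.
 */
enum swbp_action decide_swbp(int uprobe_found, int is_swbp)
{
	if (uprobe_found)
		return ACT_PROCESS_UPROBE;
	if (is_swbp > 0)
		return ACT_SEND_SIGTRAP;
	/* is_swbp == 0 (unregister race) or < 0 (e.g. -EFAULT): restart. */
	return ACT_RESTART_INSN;
}
```

      In the -EFAULT case the restarted instruction faults on the unmapped
      address, which is how the task ends up with SIGSEGV rather than
      SIGTRAP.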
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192947.GF8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Change register_for_each_vma() to take mm->mmap_sem for writing · 77fc4af1
      Committed by Oleg Nesterov
      Change register_for_each_vma() to take mm->mmap_sem for writing.
      This is a bit unfortunate but hopefully not too bad, this is the
      slow path anyway.
      
      This is needed to ensure that find_active_uprobe() can not race
      with uprobe_register() which adds the new bp at the same
      bp_vaddr, after find_uprobe() fails and before
      is_swbp_at_addr_fast() checks the memory.
      
      IOW, this is needed to ensure that if find_active_uprobe()
      returns NULL but is_swbp == true, we can safely assume that it
      was the "normal" int3 and we should send SIGTRAP.
      
      There is another reason for this change. We are going to replace
      uprobes_state->count with MMF_ flags set by register/unregister
      and cleared by find_active_uprobe(), and set/clear shouldn't
      race with each other.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192928.GE8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Teach find_active_uprobe() to provide the "is_swbp" info · d790d346
      Committed by Oleg Nesterov
      A separate patch to simplify the review, and for the
      documentation.
      
      The patch adds another "int *is_swbp" argument to
      find_active_uprobe(), so far its only caller doesn't use this
      info.
      
      With this patch find_active_uprobe() additionally does:
      
      	- if find_vma() + ->vm_start check fails, *is_swbp = -EFAULT
      
      	- otherwise, if valid_vma() + find_uprobe() fails, it holds
      	  the result of is_swbp_at_addr(), can be negative too. The
      	  latter is only possible if we raced with another thread
      	  which did munmap/etc after we hit this bp.
      
      IOW, if find_active_uprobe(&is_swbp) returns NULL, the caller
      can look at is_swbp to figure out whether the current insn is bp
      or not, or detect the race with another thread if it is
      negative.
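      A toy model of this contract is sketched below. Everything in it is
      an assumption made for illustration: struct toy_mm stands in for the
      address space with a single mapping, toy_find_active_uprobe() returns
      1 or 0 instead of a uprobe pointer, and the case where reading the
      opcode itself fails with a negative value is left out.

```c
#include <assert.h>
#include <errno.h>

#define TOY_SWBP_INSN 0xcc	/* x86 int3 opcode */

/* Hypothetical stand-in for the probed task's address space. */
struct toy_mm {
	unsigned long vm_start, vm_end;	/* one mapping: [vm_start, vm_end) */
	int mapped;			/* 0 once another thread unmapped it */
	unsigned long probe_vaddr;	/* address with a registered uprobe */
	unsigned char opcode;		/* byte stored at the trapping address */
};

/*
 * Returns 1 if a uprobe is registered at bp_vaddr. Otherwise returns 0
 * and reports through *is_swbp why the lookup failed. On success the
 * sketch leaves *is_swbp untouched.
 */
int toy_find_active_uprobe(const struct toy_mm *mm, unsigned long bp_vaddr,
			   int *is_swbp)
{
	/* Mirrors the "find_vma() + ->vm_start check fails" case. */
	if (!mm->mapped || bp_vaddr < mm->vm_start || bp_vaddr >= mm->vm_end) {
		*is_swbp = -EFAULT;
		return 0;
	}
	if (mm->probe_vaddr == bp_vaddr)
		return 1;
	/* No uprobe here: report whether a breakpoint insn is present. */
	*is_swbp = (mm->opcode == TOY_SWBP_INSN);
	return 0;
}
```

      A caller that gets 0 back can then branch on is_swbp exactly as the
      text describes: positive means a plain breakpoint, zero means the
      breakpoint is gone, negative means the memory went away.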
      
      Note: I think that performance-wise this change is fine. This
      adds is_swbp_at_addr(), but only if we raced with
      uprobe_unregister() or if we hit the "normal" int3 but this mm
      has uprobes as well. And even in this case the slow
      read_opcode() path is very unlikely, this insn recently
      triggered do_int3(), __copy_from_user_inatomic() shouldn't fail
      in the likely case.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192914.GD8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Introduce find_active_uprobe() helper · 3a9ea052
      Committed by Oleg Nesterov
      No functional changes. Move the "find uprobe" code from
      handle_swbp() to the new helper, find_active_uprobe().
      
      Note: with or without this change, the find-active-uprobe logic
      is not exactly right. We can race with another thread which
      unmaps the memory with the valid uprobe before we take
      mm->mmap_sem. We can't find this uprobe simply because
      find_vma() fails. In this case we wrongly assume that this trap
      was not caused by uprobe and send the erroneous SIGTRAP. See the
      next changes.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192857.GC8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Change read_opcode() to use FOLL_FORCE · a3d7bb47
      Committed by Oleg Nesterov
      set_orig_insn()->read_opcode() should not fail if the probed
      task did mprotect() after uprobe_register(), change it to use
      FOLL_FORCE. Without FOLL_WRITE this doesn't have any "side"
      effect but allows reading the !VM_READ memory.
      
      There is another reason for this change, we are going to use
      is_swbp_at_addr() from handle_swbp() which can race with another
      thread doing mprotect().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192759.GB8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Optimize is_swbp_at_addr() for current->mm · c00b2750
      Committed by Oleg Nesterov
      Change is_swbp_at_addr() to try to avoid the costly
      read_opcode() if mm == current->mm, __copy_from_user_inatomic()
      should succeed in the likely case.
      
      Currently this optimization is not important, but we are going
      to add more is_swbp_at_addr(current->mm) callers.
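      The fast-path-then-fallback shape of this optimization can be
      sketched in user space as follows. The names are hypothetical:
      struct toy_reader, toy_fast_copy() and toy_slow_read() merely stand
      in for the address being read, __copy_from_user_inatomic() and
      read_opcode(), and the slow path in this sketch never fails.

```c
#include <assert.h>

#define TOY_SWBP_INSN 0xcc	/* x86 int3 opcode */

/* Hypothetical model of one readable opcode byte. */
struct toy_reader {
	int fast_ok;		/* 1 if the cheap atomic copy would succeed */
	unsigned char byte;	/* opcode byte stored at the address */
	int slow_calls;		/* how often the costly path was taken */
};

/* Cheap read: may fail, e.g. when the page is not resident. */
static int toy_fast_copy(struct toy_reader *r, unsigned char *out)
{
	if (!r->fast_ok)
		return -1;
	*out = r->byte;
	return 0;
}

/* Costly read: always works in this sketch, but is counted. */
static int toy_slow_read(struct toy_reader *r, unsigned char *out)
{
	r->slow_calls++;
	*out = r->byte;
	return 0;
}

/* Returns 1 if a breakpoint insn is present, 0 if not, negative on error. */
int toy_is_swbp_at_addr(struct toy_reader *r, int mm_is_current)
{
	unsigned char opcode;
	int err;

	/* Only try the cheap copy when reading our own address space. */
	if (!mm_is_current || toy_fast_copy(r, &opcode) != 0) {
		err = toy_slow_read(r, &opcode);
		if (err)
			return err;
	}
	return opcode == TOY_SWBP_INSN;
}
```

      The slow_calls counter exists only so the sketch can show that the
      costly path is skipped whenever the fast copy succeeds.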
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192744.GA8057@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 01 Jun, 2012 1 commit
  7. 23 May, 2012 1 commit
    • Revert "sched, perf: Use a single callback into the scheduler" · ab0cce56
      Committed by Jiri Olsa
      This reverts commit cb04ff9a ("sched, perf: Use a single
      callback into the scheduler").
      
      Before this change was introduced, the process switch worked
      like this (wrt. perf event scheduling):
      
           schedule (prev, next)
             - schedule out all perf events for prev
             - switch to next
             - schedule in all perf events for current (next)
      
      After the commit, the process switch looks like:
      
           schedule (prev, next)
             - schedule out all perf events for prev
             - schedule in all perf events for (next)
             - switch to next
      
      The problem is that once we schedule the perf events in, the PMU
      is enabled and we can receive events even before we make the
      switch to next, so "current" is still the prev process (event
      SAMPLE data are filled in based on the value of "current").
      
      That's exactly what we see in the test__PERF_RECORD test. We receive
      SAMPLEs with the PID of the process that our tracee is scheduled
      from.
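      
      The effect of the two orderings can be reproduced with a toy model.
      This is an illustrative sketch, not scheduler code: struct toy_cpu,
      toy_event_fires() and toy_context_switch() are hypothetical names,
      and a single event is assumed to arrive just before the hand-over of
      "current".

```c
#include <assert.h>

/* Hypothetical per-CPU state for this sketch. */
struct toy_cpu {
	int current_pid;	/* PID of the task "current" points at */
	int pmu_enabled;	/* 1 while perf events are scheduled in */
	int sample_pid;		/* PID stamped on the last sample, -1 if none */
};

/* A sample is stamped with whatever "current" is when the event fires. */
static void toy_event_fires(struct toy_cpu *cpu)
{
	if (cpu->pmu_enabled)
		cpu->sample_pid = cpu->current_pid;
}

/*
 * Returns the PID recorded for an event arriving just before the switch,
 * or -1 if the event was not recorded at all.
 */
int toy_context_switch(int prev, int next, int single_callback)
{
	struct toy_cpu cpu = {
		.current_pid = prev, .pmu_enabled = 1, .sample_pid = -1,
	};

	cpu.pmu_enabled = 0;		/* schedule out events for prev */
	if (single_callback)
		cpu.pmu_enabled = 1;	/* schedule in events for next: too early */
	toy_event_fires(&cpu);		/* event arrives before the hand-over */
	cpu.current_pid = next;		/* switch to next */
	if (!single_callback)
		cpu.pmu_enabled = 1;	/* schedule in events for current (next) */
	return cpu.sample_pid;
}
```

      With single_callback set, the sample carries prev's PID, which is the
      failure described above. With two hooks the event falls into the
      window where nothing is recorded.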
      
      Discussed with Peter Zijlstra:
      
       > Bah!, yeah I guess reverting is the right thing for now. Sad
       > though.
       >
       > So by having the two hooks we have a black-spot between them
       > where we receive no events at all, this black-spot covers the
       > hand-over of current and we thus don't receive the 'wrong'
       > events.
       >
       > I rather liked we could do away with both that black-spot and
       > clean up the code a little, but apparently people rely on it.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: acme@redhat.com
      Cc: paulus@samba.org
      Cc: cjashfor@linux.vnet.ibm.com
      Cc: fweisbec@gmail.com
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/20120523111302.GC1638@m.brq.redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 09 May, 2012 2 commits
  9. 26 Apr, 2012 2 commits
  10. 14 Apr, 2012 2 commits
    • uprobes/core: Decrement uprobe count before the pages are unmapped · cbc91f71
      Committed by Srikar Dronamraju
      Uprobes has a callback (uprobe_munmap()) in the unmap path to
      maintain the uprobes count.
      
      In the exit path this callback gets called in unlink_file_vma().
      However by the time unlink_file_vma() is called, the pages would
      have been unmapped (in unmap_vmas()) and the task->rss_stat counts
      accounted (in zap_pte_range()).
      
      If the exiting process has probepoints, uprobe_munmap() checks if
      the breakpoint instruction was around before decrementing the probe
      count.
      
      This results in a file backed page being reread by uprobe_munmap()
      and hence it does not find the breakpoint.
      
      This patch fixes this problem by moving the callback to
      unmap_single_vma(). Since unmap_single_vma() may not unmap the
      complete vma, add start and end parameters to uprobe_munmap().
      
      This bug became apparent courtesy of commit c3f0327f
      ("mm: add rss counters consistency check").
      Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120411103527.23245.9835.sendpatchset@srdronam.in.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes/core: Make background page replacement logic account for rss_stat counters · 7396fa81
      Committed by Srikar Dronamraju
      Background page replacement logic adds a new anonymous page
      instead of a file backed (while inserting a breakpoint) /
      anonymous page (while removing a breakpoint).
      
      Hence the uprobes logic should take care to update the
      task->rss_stat counters accordingly.
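      
      The accounting rule implied here can be sketched as below. This is a
      simplified model under stated assumptions: struct toy_rss and
      toy_replace_page() are hypothetical names, and the replacement page
      is always anonymous, as the text says.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-type RSS counters. */
struct toy_rss {
	long file_pages;	/* resident file-backed pages */
	long anon_pages;	/* resident anonymous pages */
};

/*
 * Account for replacing one resident page with a new anonymous page.
 * old_is_anon: nonzero if the page being replaced was already anonymous.
 */
void toy_replace_page(struct toy_rss *rss, int old_is_anon)
{
	if (!old_is_anon) {
		/* File-backed page out, anonymous page in. */
		rss->file_pages--;
		rss->anon_pages++;
	}
	/* Anonymous replaced by anonymous: no net change. */
}
```

      Either way the total number of resident pages stays the same. Only
      the split between the two counters moves, and it moves only when the
      replaced page was file-backed.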
      
      This bug became apparent courtesy of commit c3f0327f
      ("mm: add rss counters consistency check").
      Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120411103516.23245.2700.sendpatchset@srdronam.in.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>