  1. 09 Feb, 2013 21 commits
  2. 25 Jan, 2013 1 commit
  3. 16 Nov, 2012 1 commit
    • uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race · 32cdba1e
      Committed by Oleg Nesterov
      This was always racy, but 26872090
      "uprobes: Rework register_for_each_vma() to make it O(n)" should be
      blamed anyway, it made everything worse and I didn't notice.
      
      register/unregister call build_map_info() and then do install/remove
      breakpoint for every mm which mmaps inode/offset. This can obviously
      race with fork()->dup_mmap() in between and we can miss the child.
      
      uprobe_register() could be easily fixed but unregister is much worse,
      the new mm inherits "int3" from parent and there is no way to detect
      this if uprobe goes away.
      
      So this patch simply adds percpu_down_read/up_read around dup_mmap(),
      and percpu_down_write/up_write into register_for_each_vma().
      
      This adds 2 new hooks into dup_mmap() but we can kill uprobe_dup_mmap()
      and fold it into uprobe_end_dup_mmap().
      Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
  4. 15 Nov, 2012 1 commit
  5. 04 Nov, 2012 2 commits
  6. 09 Oct, 2012 2 commits
    • mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end · 6bdb913f
      Committed by Haggai Eran
      In order to allow sleeping during invalidate_page mmu notifier calls, we
      need to avoid calling when holding the PT lock.  In addition to its direct
      calls, invalidate_page can also be called as a substitute for a change_pte
      call, in case the notifier client hasn't implemented change_pte.
      
      This patch drops the invalidate_page call from change_pte, and instead
      wraps all calls to change_pte with invalidate_range_start and
      invalidate_range_end calls.
      
      Note that change_pte still cannot sleep after this patch, and that clients
      implementing change_pte should not take action on it in case the number of
      outstanding invalidate_range_start calls is larger than one, otherwise
      they might miss a later invalidation.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Cc: Andrea Arcangeli <andrea@qumranet.com>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Haggai Eran <haggaie@mellanox.com>
      Cc: Shachar Raindel <raindel@mellanox.com>
      Cc: Liran Liss <liranl@mellanox.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
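The rule for change_pte clients in the note above can be sketched in plain C. This is a minimal user-space model, not the mmu notifier API: `client_model`, its fields and every function name here are hypothetical, and the single cached `pte` stands in for whatever secondary mapping a real client maintains.

```c
/* Hypothetical notifier client state. */
struct client_model {
    int range_active;   /* outstanding invalidate_range_start calls */
    unsigned long pte;  /* client's cached copy of one mapping, 0 = none */
};

void client_range_start(struct client_model *c)
{
    c->range_active++;
    c->pte = 0;         /* drop the cached mapping */
}

void client_range_end(struct client_model *c)
{
    c->range_active--;
}

/*
 * Called between range_start and range_end by the wrapper below.
 * With exactly one outstanding range, that range is the one wrapping
 * this call, so the new value can be installed. With more than one,
 * another invalidation is still in flight and might cover the same
 * address, so the update is skipped.
 */
void client_change_pte(struct client_model *c, unsigned long new_pte)
{
    if (c->range_active > 1)
        return;
    c->pte = new_pte;
}

/* Models the pattern from the patch: every change_pte is bracketed. */
void set_pte_notify_model(struct client_model *c, unsigned long new_pte)
{
    client_range_start(c);
    client_change_pte(c, new_pte);
    client_range_end(c);
}
```

In this model a skipped update only costs the client a later refault of the mapping, whereas acting on it could leave a stale entry behind once the other invalidation completes.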
    • mm: replace vma prio_tree with an interval tree · 6b2dbba8
      Committed by Michel Lespinasse
      Implement an interval tree as a replacement for the VMA prio_tree.  The
      algorithms are similar to lib/interval_tree.c; however that code can't be
      directly reused as the interval endpoints are not explicitly stored in the
      VMA.  So instead, the common algorithm is moved into a template and the
      details (node type, how to get interval endpoints from the node, etc) are
      filled in using the C preprocessor.
      
      Once the interval tree functions are available, using them as a
      replacement to the VMA prio tree is a relatively simple, mechanical job.
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
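The preprocessor-template idea can be illustrated with a much smaller example than a full tree. The sketch below is an assumption-laden illustration, not the kernel's template: `INTERVAL_OVERLAP_DEFINE`, `plain_node` and `vma_model` are hypothetical names, and only an overlap test is generated. A real interval tree would presumably generate insert, remove and iteration helpers in the same way.

```c
#include <stdbool.h>

/*
 * Hypothetical template: generates an overlap test for any node type,
 * given macros that compute the first and last index of its interval.
 */
#define INTERVAL_OVERLAP_DEFINE(name, type, START, LAST)                 \
bool name(const type *node, unsigned long first, unsigned long last)     \
{                                                                         \
    return START(node) <= last && first <= LAST(node);                    \
}

/* Node type with explicitly stored endpoints. */
struct plain_node {
    unsigned long start, last;
};
#define PLAIN_START(n) ((n)->start)
#define PLAIN_LAST(n)  ((n)->last)
INTERVAL_OVERLAP_DEFINE(plain_overlaps, struct plain_node,
                        PLAIN_START, PLAIN_LAST)

/* VMA-like node: the endpoints are derived, not stored. */
struct vma_model {
    unsigned long pgoff;     /* file offset of the mapping, in pages */
    unsigned long nr_pages;  /* length of the mapping, in pages */
};
#define VMA_START(v) ((v)->pgoff)
#define VMA_LAST(v)  ((v)->pgoff + (v)->nr_pages - 1)
INTERVAL_OVERLAP_DEFINE(vma_overlaps, struct vma_model,
                        VMA_START, VMA_LAST)
```

Both generated functions share one algorithm; only the node type and the endpoint accessors differ, which is the property the commit message relies on.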
  7. 08 Oct, 2012 6 commits
    • uprobes: Fix the racy uprobe->flags manipulation · 71434f2f
      Committed by Oleg Nesterov
      Multiple threads can manipulate uprobe->flags, this is obviously
      unsafe. For example mmap can set UPROBE_COPY_INSN while register
      tries to set UPROBE_RUN_HANDLER, the latter can also race with
      can_skip_sstep() which clears UPROBE_SKIP_SSTEP.
      
      Change this code to use bitops.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
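The difference between a plain read-modify-write and an atomic bit operation can be shown in user space with C11 atomics. This is only a sketch under stated assumptions: the kernel has its own bitops such as set_bit() and clear_bit(), while the code below substitutes atomic_fetch_or() and atomic_fetch_and(); `uprobe_model`, the helper names and the bit numbers are hypothetical, and only the flag names come from the commit message.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers; the names follow the commit message. */
enum {
    UPROBE_RUN_HANDLER = 0,
    UPROBE_COPY_INSN   = 1,
    UPROBE_SKIP_SSTEP  = 2,
};

struct uprobe_model {
    atomic_uint flags;
};

/*
 * A plain "flags |= bit" is a load, an OR and a store; two threads
 * doing it concurrently can lose one of the updates. An atomic
 * read-modify-write cannot.
 */
void flag_set(struct uprobe_model *u, int bit)
{
    atomic_fetch_or(&u->flags, 1u << bit);
}

void flag_clear(struct uprobe_model *u, int bit)
{
    atomic_fetch_and(&u->flags, ~(1u << bit));
}

bool flag_test(struct uprobe_model *u, int bit)
{
    return (atomic_load(&u->flags) >> bit) & 1u;
}
```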
    • uprobes: Fix prepare_uprobe() race with itself · 4710f05f
      Committed by Oleg Nesterov
      install_breakpoint() is called under mm->mmap_sem, this protects
      set_swbp() but not prepare_uprobe(). Two or more different tasks
      can call install_breakpoint()->prepare_uprobe() at the same time,
      this leads to numerous problems if UPROBE_COPY_INSN is not set.
      
      For example, a second copy_insn() can corrupt the already analyzed
      and fixed-up uprobe->arch.insn and race with handle_swbp().
      
      This patch simply adds uprobe->copy_mutex to serialize this code.
      We could probably reuse ->consumer_rwsem, but this would mean that
      consumer->handler() can not use mm->mmap_sem, not good.
      
      Note: this is another temporary ugly hack until we move this logic
      into uprobe_register().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Introduce prepare_uprobe() · cb9a19fe
      Committed by Oleg Nesterov
      Preparation. Extract the copy_insn/arch_uprobe_analyze_insn code
      from install_breakpoint() into the new helper, prepare_uprobe().
      
      And move uprobe->flags defines from uprobes.h to uprobes.c, nobody
      else can use them anyway.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Fix handle_swbp() vs unregister() + register() race · 142b18dd
      Committed by Oleg Nesterov
      Strictly speaking this race was added by me in 56bb4cf6. However
      I think that this bug is just another indication that we should
      move copy_insn/uprobe_analyze_insn code from install_breakpoint()
      to uprobe_register(), there are a lot of other reasons for that.
      Until then, add a hack to close the race.
      
      A task can hit uprobe U1, but before it calls find_uprobe() this
      uprobe can be unregistered *AND* another uprobe U2 can be added to
      uprobes_tree at the same inode/offset. In this case handle_swbp()
      will use the not-fully-initialized U2, in particular its arch.insn
      for xol.
      
      Add the additional !UPROBE_COPY_INSN check into handle_swbp(),
      if this flag is not set we simply restart as if the new uprobe was
      not inserted yet. This is not very nice, we need barriers, but we
      will remove this hack when we change uprobe_register().
      
      Note: with or without this patch install_breakpoint() can race with
      itself, yet another reason to kill UPROBE_COPY_INSN altogether. And
      even the usage of uprobe->flags is not safe. See the next patches.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Do not delete uprobe if uprobe_unregister() fails · 076a365b
      Committed by Oleg Nesterov
      delete_uprobe() must not be called if register_for_each_vma(false)
      fails to remove all breakpoints, __uprobe_unregister() is correct.
      The problem is that register_for_each_vma(false) always returns 0
      and thus this logic does not work.
      
      1. Change verify_opcode() to return 0 rather than -EINVAL when
         unregister detects the !is_swbp insn, we can treat this case
         as success and currently unregister paths ignore the error
         code anyway.
      
      2. Change remove_breakpoint() to propagate the error code from
         write_opcode().
      
      3. Change register_for_each_vma(is_register => false) to remove
         as many breakpoints as possible but return non-zero if
         remove_breakpoint() fails at least once.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
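The three steps above amount to a "keep going, but remember the failure" loop. A minimal sketch of that shape follows, with hypothetical types and names (`mapping_model`, `remove_breakpoint_model`, `unregister_all_model`, `can_delete_probe`); it models the control flow only and does not touch any real memory mappings.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one mapping of the probed inode/offset. */
struct mapping_model {
    int has_breakpoint;  /* 1 while the breakpoint is installed */
    int write_fails;     /* 1 to simulate the opcode write failing */
};

/*
 * Steps 1 and 2: a breakpoint that is already gone counts as success,
 * and an error from the write is passed up to the caller.
 */
int remove_breakpoint_model(struct mapping_model *m)
{
    if (!m->has_breakpoint)
        return 0;
    if (m->write_fails)
        return -EFAULT;
    m->has_breakpoint = 0;
    return 0;
}

/* Step 3: visit every mapping even after a failure, but report it. */
int unregister_all_model(struct mapping_model *maps, size_t n)
{
    int err = 0;

    for (size_t i = 0; i < n; i++)
        err |= remove_breakpoint_model(&maps[i]);
    return err;
}

/* The probe may only be deleted once every breakpoint is gone. */
int can_delete_probe(struct mapping_model *maps, size_t n)
{
    return unregister_all_model(maps, n) == 0;
}
```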
    • uprobes: Don't return success if alloc_uprobe() fails · a5f658b7
      Committed by Oleg Nesterov
      If alloc_uprobe() fails, uprobe_register() should return -ENOMEM, not 0.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
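The corrected error path has a simple shape, sketched here in user-space C. All names (`probe_model`, `alloc_probe_model`, `register_probe_model`) and the `fail_alloc` switch are hypothetical and exist only to make the failure reproducible.

```c
#include <errno.h>
#include <stdlib.h>

struct probe_model {
    long offset;
};

/* Hypothetical allocator; fail_alloc simulates memory exhaustion. */
struct probe_model *alloc_probe_model(long offset, int fail_alloc)
{
    struct probe_model *p = fail_alloc ? NULL : malloc(sizeof(*p));

    if (p)
        p->offset = offset;
    return p;
}

int register_probe_model(long offset, int fail_alloc)
{
    struct probe_model *p = alloc_probe_model(offset, fail_alloc);

    /* An allocation failure must surface as an error, never as 0. */
    if (!p)
        return -ENOMEM;

    /* ... breakpoints would be installed here ... */
    free(p);
    return 0;
}
```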
  8. 30 Sep, 2012 6 commits
    • uprobes: Simplify is_swbp_at_addr(), remove stale comments · ec75fba9
      Committed by Oleg Nesterov
      After the previous change is_swbp_at_addr() is always called with
      current->mm. Remove this check and move it close to its single caller.
      
      Also, remove the obsolete comment about is_swbp_at_addr() and
      uprobe_state.count.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Kill set_orig_insn()->is_swbp_at_addr() · ed6f6a50
      Committed by Oleg Nesterov
      Unlike set_swbp(), set_orig_insn()->is_swbp_at_addr() makes sense,
      although it can't prevent all confusions.
      
      But the usage of is_swbp_at_addr() is equally confusing, and it adds
      the extra get_user_pages() we can avoid.
      
      This patch removes set_orig_insn()->is_swbp_at_addr() but changes
      write_opcode() to do the necessary checks before replace_page().
      
      Perhaps it also makes sense to ensure PAGE_MAPPING_ANON in unregister
      case.
      
      find_active_uprobe() becomes the only user of is_swbp_at_addr(),
      we can change its semantics.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Introduce copy_opcode(), kill read_opcode() · cceb55aa
      Committed by Oleg Nesterov
      No functional changes, preparations.
      
      1. Extract the kmap-and-memcpy code from read_opcode() into the
         new trivial helper, copy_opcode(). The next patch will add
         another user.
      
      2. read_opcode() becomes really trivial, fold it into its single
         caller, is_swbp_at_addr().
      
      3. Remove "auprobe" argument from write_opcode(), it is not used
         since f403072c.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Kill set_swbp()->is_swbp_at_addr() · e97f65a1
      Committed by Oleg Nesterov
      A separate patch for better documentation.
      
      set_swbp()->is_swbp_at_addr() is not needed for correctness, it is
      harmless to do the unnecessary __replace_page(old_page, new_page)
      when these 2 pages are identical.
      
      And it can not be counted as optimization. mmap/register races are
      very unlikely, while in the likely case is_swbp_at_addr() adds the
      extra get_user_pages() even if the caller is uprobe_mmap(current->mm)
      and returns false.
      
      Note also that the semantics/usage of is_swbp_at_addr() in uprobe.c
      is confusing. set_swbp() uses it to detect the case when this insn
      was already modified by uprobes, that is why it should always compare
      the opcode with UPROBE_SWBP_INSN even if the hardware (like powerpc)
      has other trap insns. It doesn't matter if this breakpoint was in fact
      installed by gdb or application itself, we are going to "steal" this
      breakpoint anyway and execute the original insn from vm_file even if
      it no longer matches the memory.
      
      OTOH, handle_swbp()->find_active_uprobe() uses is_swbp_at_addr() to
      figure out whether we need to send SIGTRAP or not if we can not find
      uprobe, so in this case it should return true for all trap variants,
      not only for UPROBE_SWBP_INSN.
      
      This patch removes set_swbp()->is_swbp_at_addr(), the next patches
      will remove it from set_orig_insn() which is similar to set_swbp()
      in this respect. So the only caller will be handle_swbp() and we
      can make its semantics clear.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Restrict valid_vma(false) to skip VM_SHARED vmas · e40cfce6
      Committed by Oleg Nesterov
      valid_vma(false) ignores ->vm_flags, this is not actually right.
      We should never try to write into MAP_SHARED mapping, this can
      confuse an application which actually writes to ->vm_file.
      
      With this patch valid_vma(false) ignores VM_WRITE only but checks
      other (immutable) bits checked by valid_vma(true). This can also
      speedup uprobe_munmap() and uprobe_unregister().
      
      Note: even after this patch _unregister can confuse the probed
      application if it does mprotect(PROT_WRITE) after _register and
      installs "int3", but this is hardly possible to avoid and this
      doesn't differ from gdb case.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC · 78a32054
      Committed by Oleg Nesterov
      uprobe_register() or uprobe_mmap() requires VM_READ | VM_EXEC, this
      is not right. An application can do mprotect(PROT_EXEC) later and
      execute this code.
      
      Change valid_vma(is_register => true) to check VM_MAYEXEC instead.
      No need to check VM_MAYREAD, it is always set.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
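Taken together, this commit and the VM_SHARED restriction listed before it describe a flag check of roughly the following shape. The sketch is a user-space model under assumptions: `region_model` and `valid_vma_model` are hypothetical, the flag values are illustrative rather than the kernel's, and the VM_HUGETLB exclusion is assumed to be one of the "other (immutable) bits" the earlier message mentions.

```c
#include <stdbool.h>

/* Illustrative flag values for this sketch only. */
#define VM_WRITE    0x02u
#define VM_EXEC     0x04u
#define VM_SHARED   0x08u
#define VM_MAYEXEC  0x40u
#define VM_HUGETLB  0x80u

struct region_model {
    bool has_file;          /* the mapping is backed by a file */
    unsigned int vm_flags;
};

bool valid_vma_model(const struct region_model *vma, bool is_register)
{
    /* Bits examined in both modes; of these only VM_MAYEXEC may be set. */
    unsigned int mask = VM_HUGETLB | VM_MAYEXEC | VM_SHARED;

    /* Registration additionally refuses currently writable mappings. */
    if (is_register)
        mask |= VM_WRITE;

    return vma->has_file && (vma->vm_flags & mask) == VM_MAYEXEC;
}
```

A mapping that is not executable yet but may become so via mprotect(PROT_EXEC) passes the check, while a shared mapping is rejected in both the register and the unregister mode.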