1. 12 1月, 2014 1 次提交
    • S
      perf/x86: Fix active_entry initialization · f3ae75de
      Stephane Eranian 提交于
      This patch fixes a problem with the initialization of the
      struct perf_event active_entry field. It is defined inside
      an anonymous union and was initialized in perf_event_alloc()
      using INIT_LIST_HEAD(). However at that time, we do not know
      whether the event is going to use active_entry or hlist_entry (SW).
      Or at last, we don't want to make that determination there.
      The problem is that hlist and list_head are not initialized
      the same way. One is okay with NULL (from kzmalloc), the other
      needs to pointers to point to self.
      
      This patch resolves this problem by dropping the union.
      This will avoid problems later on, if someone starts using
      active_entry or hlist_entry without verifying that they
      actually overlap. This also solves the initialization
      problem.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: ak@linux.intel.com
      Cc: acme@redhat.com
      Cc: jolsa@redhat.com
      Cc: zheng.z.yan@intel.com
      Cc: bp@alien8.de
      Cc: vincent.weaver@maine.edu
      Cc: maria.n.dimakopoulou@gmail.com
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1389176153-3128-2-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f3ae75de
  2. 17 12月, 2013 1 次提交
  3. 11 12月, 2013 1 次提交
  4. 27 11月, 2013 1 次提交
  5. 20 11月, 2013 4 次提交
  6. 19 11月, 2013 1 次提交
  7. 13 11月, 2013 1 次提交
  8. 10 11月, 2013 2 次提交
    • O
      uprobes: Fix the memory out of bound overwrite in copy_insn() · 2ded0980
      Oleg Nesterov 提交于
      1. copy_insn() doesn't look very nice, all calculations are
         confusing and it is not immediately clear why do we read
         the 2nd page first.
      
      2. The usage of inode->i_size is wrong on 32-bit machines.
      
      3. "Instruction at end of binary" logic is simply wrong, it
         doesn't handle the case when uprobe->offset > inode->i_size.
      
         In this case "bytes" overflows, and __copy_insn() writes to
         the memory outside of uprobe->arch.insn.
      
         Yes, uprobe_register() checks i_size_read(), but this file
         can be truncated after that. All i_size checks are racy, we
         do this only to catch the obvious mistakes.
      
      Change copy_insn() to call __copy_insn() in a loop, simplify
      and fix the bytes/nbytes calculations.
      
      Note: we do not care if we read extra bytes after inode->i_size
      if we got the valid page. This is fine because the task gets the
      same page after page-fault, and arch_uprobe_analyze_insn() can't
      know how many bytes were actually read anyway.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      2ded0980
    • O
      uprobes: Fix the wrong usage of current->utask in uprobe_copy_process() · 70d7f987
      Oleg Nesterov 提交于
      Commit aa59c53f "uprobes: Change uprobe_copy_process() to dup
      xol_area" has a stupid typo, we need to setup t->utask->vaddr but
      the code wrongly uses current->utask.
      
      Even with this bug dup_xol_work() works "in practice", but only
      because get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE) likely
      returns the same address every time.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      70d7f987
  9. 07 11月, 2013 3 次提交
    • O
      uprobes: Export write_opcode() as uprobe_write_opcode() · f72d41fa
      Oleg Nesterov 提交于
      set_swbp() and set_orig_insn() are __weak, but this is pointless
      because write_opcode() is static.
      
      Export write_opcode() as uprobe_write_opcode() for the upcoming
      arm port, this way it can actually override set_swbp() and use
      __opcode_to_mem_arm(bpinsn) instead if UPROBE_SWBP_INSN.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      f72d41fa
    • O
      uprobes: Introduce arch_uprobe->ixol · 8a8de66c
      Oleg Nesterov 提交于
      Currently xol_get_insn_slot() assumes that we should simply copy
      arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn)
      just the copy of the original insn.
      
      This is not true for arm which needs to create another insn to
      execute it out-of-line.
      
      So this patch simply adds the new member, ->ixol into the union.
      This doesn't make any difference for x86 and powerpc, but arm
      can divorce insn/ixol and initialize the correct xol insn in
      arch_uprobe_analyze_insn().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      8a8de66c
    • O
      uprobes: Kill module_init() and module_exit() · 736e89d9
      Oleg Nesterov 提交于
      Turn module_init() into __initcall() and kill module_exit().
      
      This code can't be compiled as a module so these module_*()
      calls only add the confusion, especially if arch-dependant
      code needs its own initialization hooks.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      736e89d9
  10. 06 11月, 2013 8 次提交
  11. 30 10月, 2013 6 次提交
    • O
      uprobes: Teach uprobe_copy_process() to handle CLONE_VFORK · 3ab67966
      Oleg Nesterov 提交于
      uprobe_copy_process() does nothing if the child shares ->mm with
      the forking process, but there is a special case: CLONE_VFORK.
      In this case it would be more correct to do dup_utask() but avoid
      dup_xol(). This is not that important, the child should not unwind
      its stack too much, this can corrupt the parent's stack, but at
      least we need this to allow to ret-probe __vfork() itself.
      
      Note: in theory, it would be better to check task_pt_regs(p)->sp
      instead of CLONE_VFORK, we need to dup_utask() if and only if the
      child can return from the function called by the parent. But this
      needs the arch-dependant helper, and I think that nobody actually
      does clone(same_stack, CLONE_VM).
      Reported-by: NMartin Cermak <mcermak@redhat.com>
      Reported-by: NDavid Smith <dsmith@redhat.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      3ab67966
    • O
      uprobes: Change uprobe_copy_process() to dup xol_area · aa59c53f
      Oleg Nesterov 提交于
      This finally fixes the serious bug in uretprobes: a forked child
      crashes if the parent called fork() with the pending ret probe.
      
      Trivial test-case:
      
      	# perf probe -x /lib/libc.so.6 __fork%return
      	# perf record -e probe_libc:__fork perl -le 'fork || print "OK"'
      
      (the child doesn't print "OK", it is killed by SIGSEGV)
      
      If the child returns from the probed function it actually returns
      to trampoline_vaddr, because it got the copy of parent's stack
      mangled by prepare_uretprobe() when the parent entered this func.
      
      It crashes because a) this address is not mapped and b) until the
      previous change it doesn't have the proper->return_instances info.
      
      This means that uprobe_copy_process() has to create xol_area which
      has the trampoline slot, and its vaddr should be equal to parent's
      xol_area->vaddr.
      
      Unfortunately, uprobe_copy_process() can not simply do
      __create_xol_area(child, xol_area->vaddr). This could actually work
      but perf_event_mmap() doesn't expect the usage of foreign ->mm. So
      we offload this to task_work_run(), and pass the argument via not
      yet used utask->vaddr.
      
      We know that this vaddr is fine for install_special_mapping(), the
      necessary hole was recently "created" by dup_mmap() which skips the
      parent's VM_DONTCOPY area, and nobody else could use the new mm.
      
      Unfortunately, this also means that we can not handle the errors
      properly, we obviously can not abort the already completed fork().
      So we simply print the warning if GFP_KERNEL allocation (the only
      possible reason) fails.
      Reported-by: NMartin Cermak <mcermak@redhat.com>
      Reported-by: NDavid Smith <dsmith@redhat.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      aa59c53f
    • O
      uprobes: Change uprobe_copy_process() to dup return_instances · 248d3a7b
      Oleg Nesterov 提交于
      uprobe_copy_process() assumes that the new child doesn't need
      ->utask, it should be allocated by demand.
      
      But this is not true if the forking task has the pending ret-
      probes, the child should report them as well and thus it needs
      the copy of parent's ->return_instances chain. Otherwise the
      child crashes when it returns from the probed function.
      
      Alternatively we could cleanup the child's stack, but this needs
      per-arch changes and this is not what we want. At least systemtap
      expects a .return in the child too.
      
      Note: this change alone doesn't fix the problem, see the next
      change.
      Reported-by: NMartin Cermak <mcermak@redhat.com>
      Reported-by: NDavid Smith <dsmith@redhat.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      248d3a7b
    • O
      uprobes: Teach __create_xol_area() to accept the predefined vaddr · af0d95af
      Oleg Nesterov 提交于
      Currently xol_add_vma() uses get_unmapped_area() for area->vaddr,
      but the next patches need to use the fixed address. So this patch
      adds the new "vaddr" argument to __create_xol_area() which should
      be used as area->vaddr if it is nonzero.
      
      xol_add_vma() doesn't bother to verify that the predefined addr is
      not used, insert_vm_struct() should fail if find_vma_links() detects
      the overlap with the existing vma.
      
      Also, __create_xol_area() doesn't need __GFP_ZERO to allocate area.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      af0d95af
    • O
      uprobes: Introduce __create_xol_area() · 6441ec8b
      Oleg Nesterov 提交于
      No functional changes, preparation.
      
      Extract the code which actually allocates/installs the new area
      into the new helper, __create_xol_area().
      
      While at it remove the unnecessary "ret = ENOMEM" and "ret = 0"
      in xol_add_vma(), they both have no effect.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      6441ec8b
    • O
      uprobes: Change the callsite of uprobe_copy_process() · b68e0749
      Oleg Nesterov 提交于
      Preparation for the next patches.
      
      Move the callsite of uprobe_copy_process() in copy_process() down
      to the succesfull return. We do not care if copy_process() fails,
      uprobe_free_utask() won't be called in this case so the wrong
      ->utask != NULL doesn't matter.
      
      OTOH, with this change we know that copy_process() can't fail when
      uprobe_copy_process() is called, the new task should either return
      to user-mode or call do_exit(). This way uprobe_copy_process() can:
      
      	1. setup p->utask != NULL if necessary
      
      	2. setup uprobes_state.xol_area
      
      	3. use task_work_add(p)
      
      Also, move the definition of uprobe_copy_process() down so that it
      can see get_utask().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      b68e0749
  12. 29 10月, 2013 6 次提交
  13. 18 10月, 2013 1 次提交
    • S
      perf: Disable PERF_RECORD_MMAP2 support · 3090ffb5
      Stephane Eranian 提交于
      For now, we disable the extended MMAP record support (MMAP2).
      
      We have identified cases where it would not report the correct mapping
      information, clone(VM_CLONE) but with separate pids.  We will revisit
      the support once we find a solution for this case.
      
      The patch changes the kernel to return EINVAL if attr->mmap2 is set. The
      patch also modifies the perf tool to use regular PERF_RECORD_MMAP for
      synthetic events and it also prevents the tool from requesting
      attr->mmap2 mode because the kernel would reject it.
      
      The support will be revisited once the kenrel interface is updated.
      
      In V2, we reduce the patch to the strict minimum.
      
      In V3, we avoid calling perf_event_open() with mmap2 set because we know
      it will fail and require fallback retry.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20131017173215.GA8820@quadSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3090ffb5
  14. 04 10月, 2013 3 次提交
    • A
      perf: Add generic transaction flags · fdfbbd07
      Andi Kleen 提交于
      Add a generic qualifier for transaction events, as a new sample
      type that returns a flag word. This is particularly useful
      for qualifying aborts: to distinguish aborts which happen
      due to asynchronous events (like conflicts caused by another
      CPU) versus instructions that lead to an abort.
      
      The tuning strategies are very different for those cases,
      so it's important to distinguish them easily and early.
      
      Since it's inconvenient and inflexible to filter for this
      in the kernel we report all the events out and allow
      some post processing in user space.
      
      The flags are based on the Intel TSX events, but should be fairly
      generic and mostly applicable to other HTM architectures too. In addition
      to various flag words there's also reserved space to report an
      program supplied abort code. For TSX this is used to distinguish specific
      classes of aborts, like a lock busy abort when doing lock elision.
      
      Flags:
      
      Elision and generic transactions 		   (ELISION vs TRANSACTION)
      (HLE vs RTM on TSX; IBM etc.  would likely only use TRANSACTION)
      Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC)
      Retryable transaction				   (RETRY)
      Conflicts with other threads			   (CONFLICT)
      Transaction write capacity overflow		   (CAPACITY WRITE)
      Transaction read capacity overflow		   (CAPACITY READ)
      
      Transactions implicitely aborted can also return an abort code.
      This can be used to signal specific events to the profiler. A common
      case is abort on lock busy in a RTM eliding library (code 0xff)
      To handle this case we include the TSX abort code
      
      Common example aborts in TSX would be:
      
      - Data conflict with another thread on memory read.
                                            Flags: TRANSACTION|ASYNC|CONFLICT
      - executing a WRMSR in a transaction. Flags: TRANSACTION|SYNC
      - HLE transaction in user space is too large
                                            Flags: ELISION|SYNC|CAPACITY-WRITE
      
      The only flag that is somewhat TSX specific is ELISION.
      
      This adds the perf core glue needed for reporting the new flag word out.
      
      v2: Add MEM/MISC
      v3: Move transaction to the end
      v4: Separate capacity-read/write and remove misc
      v5: Remove _SAMPLE. Move abort flags to 32bit. Rename
          transaction to txn
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1379688044-14173-2-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fdfbbd07
    • K
      perf: Enforce 1 as lower limit for perf_event_max_sample_rate · 723478c8
      Knut Petersen 提交于
      /proc/sys/kernel/perf_event_max_sample_rate will accept
      negative values as well as 0.
      
      Negative values are unreasonable, and 0 causes a
      divide by zero exception in perf_proc_update_handler.
      
      This patch enforces a lower limit of 1.
      Signed-off-by: NKnut Petersen <Knut_Petersen@t-online.de>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/5242DB0C.4070005@t-online.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      723478c8
    • P
      perf: Fix perf_pmu_migrate_context · 9886167d
      Peter Zijlstra 提交于
      While auditing the list_entry usage due to a trinity bug I found that
      perf_pmu_migrate_context violates the rules for
      perf_event::event_entry.
      
      The problem is that perf_event::event_entry is a RCU list element, and
      hence we must wait for a full RCU grace period before re-using the
      element after deletion.
      
      Therefore the usage in perf_pmu_migrate_context() which re-uses the
      entry immediately is broken. For now introduce another list_head into
      perf_event for this specific usage.
      
      This doesn't actually fix the trinity report because that never goes
      through this code.
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-mkj72lxagw1z8fvjm648iznw@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9886167d
  15. 27 9月, 2013 1 次提交