    perf callchain: Feed callchains into a cursor · 1b3a0e95
    Committed by Frederic Weisbecker
    Callchains are currently fed as an array of a fixed size.
    As a result, we iterate over each callchain three times:
    
    - 1st to resolve symbols
    - 2nd to filter out context boundaries
    - 3rd for the insertion into the tree
    
    This also involves several memory allocation/deallocation pairs
    every time we insert a callchain: one for the filtered array of
    addresses and one for the array of symbols that comes along.
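    
    The three passes and the per-insertion allocations described above
    can be sketched roughly as follows. This is only an illustration:
    insert_three_pass, struct chain_tree, resolve_symbol and
    CTX_MARKER_MIN are hypothetical names, the tree is reduced to a
    counter, and the marker threshold is an assumption, not code taken
    from this patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumption: values at or above this threshold are context markers. */
#define CTX_MARKER_MIN ((uint64_t)-4095)

/* Hypothetical stand-in for the callchain tree: it only counts entries. */
struct chain_tree {
    size_t nr_entries;
};

/* Hypothetical resolver stub; a real one would look the address up. */
const char *resolve_symbol(uint64_t ip)
{
    (void)ip;
    return "unknown";
}

/* Returns the number of entries inserted, or -1 on allocation failure. */
long insert_three_pass(struct chain_tree *tree, const uint64_t *ips, size_t n)
{
    /* Two temporary arrays, allocated and freed for every callchain. */
    const char **syms = malloc((n + 1) * sizeof(*syms));
    uint64_t *kept = malloc((n + 1) * sizeof(*kept));
    size_t i, nr = 0;

    if (!syms || !kept) {
        free(syms);
        free(kept);
        return -1;
    }

    /* Pass 1: resolve symbols. */
    for (i = 0; i < n; i++)
        syms[i] = ips[i] >= CTX_MARKER_MIN ? NULL : resolve_symbol(ips[i]);

    /* Pass 2: filter out context boundaries, compacting both arrays. */
    for (i = 0; i < n; i++) {
        if (ips[i] >= CTX_MARKER_MIN)
            continue;
        kept[nr] = ips[i];
        syms[nr] = syms[i];
        nr++;
    }

    /* Pass 3: insert the surviving entries into the tree. */
    for (i = 0; i < nr; i++) {
        (void)kept[i];
        (void)syms[i];
        tree->nr_entries++;
    }

    free(kept);
    free(syms);
    return (long)nr;
}
```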
    
    Instead, feed the callchains through a linked list with persistent
    allocations. This brings several benefits:
    
    - Merge the 1st and 2nd iterations into one. That was possible before,
    but only by allocating an array slightly larger than necessary,
    because we don't know in advance the number of context
    boundaries to filter out.
    
    - Far fewer allocations/deallocations. The linked list keeps
    persistent empty entries for later use and can be extended at
    will.
    
    - Makes it easier for multiple sources of callchains to feed a
    stacktrace together. This is meant to pave the way for CFI-based
    callchains, in which traditional frame-pointer-based kernel
    stacktraces will precede CFI-based user ones, producing an overall
    callchain whose size is hard to predict. This requirement
    makes the static array obsolete and makes a linked-list-based
    iterator a much more flexible fit.
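    
    A minimal sketch of the idea, assuming a singly linked list whose
    nodes survive between callchains. The names used here (struct
    cursor, cursor_append, cursor_fill, CTX_MARKER_MIN, ...) are
    hypothetical and chosen for illustration; they are not claimed to
    match the structures in callchain.c, and the marker threshold is
    an assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumption: values at or above this threshold are context markers. */
#define CTX_MARKER_MIN ((uint64_t)-4095)

struct cursor_node {
    uint64_t ip;               /* address of this callchain entry */
    const char *sym;           /* resolved symbol, or NULL */
    struct cursor_node *next;
};

struct cursor {
    size_t nr;                 /* entries valid for the current callchain */
    struct cursor_node *first; /* head of the persistent list */
    struct cursor_node **last; /* slot where the next append lands */
};

void cursor_init(struct cursor *c)
{
    c->nr = 0;
    c->first = NULL;
    c->last = &c->first;
}

/* Start a new callchain: rewind, but keep every allocated node. */
void cursor_reset(struct cursor *c)
{
    c->nr = 0;
    c->last = &c->first;
}

/* Append one entry, reusing a leftover node when one is available. */
int cursor_append(struct cursor *c, uint64_t ip, const char *sym)
{
    struct cursor_node *node = *c->last;

    if (!node) {
        /* The list is extended only when it is too short. */
        node = calloc(1, sizeof(*node));
        if (!node)
            return -1;
        *c->last = node;
    }
    node->ip = ip;
    node->sym = sym;
    c->nr++;
    c->last = &node->next;
    return 0;
}

/* Single pass: skip context boundaries and resolve symbols together. */
int cursor_fill(struct cursor *c, const uint64_t *ips, size_t n,
                const char *(*resolve)(uint64_t))
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (ips[i] >= CTX_MARKER_MIN)
            continue;
        if (cursor_append(c, ips[i], resolve ? resolve(ips[i]) : NULL))
            return -1;
    }
    return 0;
}

/* Free every node and return the cursor to its initial state. */
void cursor_release(struct cursor *c)
{
    struct cursor_node *node = c->first;

    while (node) {
        struct cursor_node *next = node->next;
        free(node);
        node = next;
    }
    cursor_init(c);
}
```

    In this sketch a caller would run cursor_reset() before each
    callchain, fill the cursor from one or more sources, then walk the
    first nr nodes to insert them into the tree. Nodes past nr stay
    allocated and are reused by the next callchain.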
    
    Basic testing on a big perf file containing callchains (~ 176 MB)
    has shown a throughput gain of about 11% with perf report.
    
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
callchain.c 10.4 KB