-
由 Thomas Gleixner 提交于
Profiling perf with perf revealed that a large part of the processing time is spent in malloc/memcpy/free in the sample ordering code. That code copies the data from the mmap into malloc'ed memory. That's silly. We can keep the mmap and just store the pointer in the queuing data structure. For 64 bit this is not a problem as we map the whole file anyway. On 32bit we keep 8 maps around and unmap the oldest before mmaping the next chunk of the file. Performance gain: 2.95s -> 1.23s (Faktor 2.4) Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20101130163820.278787719@linutronix.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
fe174207