1. 31 7月, 2010 2 次提交
    • A
      perf tools: Release session and symbol resources on exit · d65a458b
      Arnaldo Carvalho de Melo 提交于
      So that we reduce the noise when looking for leaks using tools such as
      valgrind.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d65a458b
    • A
      perf tools: Release thread resources on PERF_RECORD_EXIT · 591765fd
      Arnaldo Carvalho de Melo 提交于
      For long running sessions with many threads with short lifetimes the
      amount of memory that the buildid process takes is too much.
      
      Since we don't have hist_entries that may be pointing to them, we can
      just release the resources associated with each thread when the exit
      (PERF_RECORD_EXIT) event is received.
      
      For normal processing we need to annotate maps with hits, and thus
      hist_entries pointing to it and drop the ones that had none. Will be
      done in a followup patch.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      591765fd
  2. 30 7月, 2010 7 次提交
  3. 28 7月, 2010 1 次提交
  4. 27 7月, 2010 7 次提交
    • D
      perf tools: Remove unneeded code for tracking the cwd in perf sessions · 88ca895d
      Dave Martin 提交于
      Tidy-up patch to remove some code and struct perf_session data members
      which are no longer needed due to the previous patch: "perf tools: Don't
      abbreviate file paths relative to the cwd".
      
      LKML-Reference: <new-submission>
      Signed-off-by: NDave Martin <dave.martin@linaro.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      88ca895d
    • D
      perf report: Don't abbreviate file paths relative to the cwd · 361d1346
      Dave Martin 提交于
      This avoids around some problems where the full path is executables and DSOs it
      needed for finding debug symbols on platforms with separated debug symbol files
      such as Ubuntu.  This is simpler than tracking an extra name for each image.
      
      The only impact should be that paths in verbose output from the perf tools
      become absolute, instead of relative to .
      
      LKML-Reference: <new-submission>
      Signed-off-by: NDave Martin <dave.martin@linaro.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      361d1346
    • A
      perf ui: New hists tree widget · 0f0cbf7a
      Arnaldo Carvalho de Melo 提交于
      The stock newt checkbox tree widget we were using was not really
      suitable for hist entry + callchain browsing.
      
      The problems with it were manifold:
      
      - We needed to traverse the whole hist_entry rb_tree to add each entry +
        callchains beforehand.
      
      - No control over the colors used for each row
      
      So a new tree widget, based mostly on slang, was written.
      
      It extends the ui_browser class already used for annotate to allow the
      user to fold/unfold branches in the callchains tree, using extra fields
      in the symbol_map class that is embedded in hist_entry and
      callchain_node instances to store the folding state and when changing
      this state calculates the number of rows that are produced when showing
      a particular hist_entry instance.
      
      This greatly speeds up browsing as we don't have to upfront touch all
      the entries and only calculate callchain related operations when some
      callchain branch is actually unfolded.
      
      The memory footprint is also reduced as the data structure is not
      duplicated, just some extra fields for controling callchain state and to
      simplify the process of seeking thru entries (nr_rows, row_offset) were
      added.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0f0cbf7a
    • A
      perf ui: Show the scroll bar over the left window frame · 8d8c369f
      Arnaldo Carvalho de Melo 提交于
      So that we gain two columns and look more like classical (at least in
      TUIs) scroll bars bars.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8d8c369f
    • A
      perf ui: Consider the refreshed dimensions in ui_browser__show · 63160f73
      Arnaldo Carvalho de Melo 提交于
      When we call ui_browser__show we may have called
      ui_browser__refresh_dimensions to check if the maximum lenght for the
      contained entries changed, such as when zooming in and out DSOs or
      threads in the hist browser.
      
      For that to happen we must delete the old form, that will take care of
      deleting the vertical scrollbar, etc, and then recreate them, with the
      new dimensions.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      63160f73
    • A
      perf hist: Introduce routine to measure lenght of formatted entry · 06daaaba
      Arnaldo Carvalho de Melo 提交于
      Will be used to figure out the window width needed in the new tree
      widget.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      06daaaba
    • A
      perf ui: Restore SPACE as an alias to PGDN in annotate · b61b55ed
      Arnaldo Carvalho de Melo 提交于
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b61b55ed
  5. 23 7月, 2010 5 次提交
  6. 22 7月, 2010 8 次提交
  7. 21 7月, 2010 10 次提交
    • M
      tracing/documentation: Document dynamic ftracer internals · 9849ed4d
      Mike Frysinger 提交于
      Add more details to the dynamic function tracing design implementation.
      Signed-off-by: NMike Frysinger <vapier@gentoo.org>
      LKML-Reference: <1279610015-10250-1-git-send-email-vapier@gentoo.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      9849ed4d
    • K
      tracing: Shrink max latency ringbuffer if unnecessary · ef710e10
      KOSAKI Motohiro 提交于
      Documentation/trace/ftrace.txt says
      
        buffer_size_kb:
      
              This sets or displays the number of kilobytes each CPU
              buffer can hold. The tracer buffers are the same size
              for each CPU. The displayed number is the size of the
              CPU buffer and not total size of all buffers. The
              trace buffers are allocated in pages (blocks of memory
              that the kernel uses for allocation, usually 4 KB in size).
              If the last page allocated has room for more bytes
              than requested, the rest of the page will be used,
              making the actual allocation bigger than requested.
              ( Note, the size may not be a multiple of the page size
                due to buffer management overhead. )
      
              This can only be updated when the current_tracer
              is set to "nop".
      
      But it's incorrect. currently total memory consumption is
      'buffer_size_kb x CPUs x 2'.
      
      Why two times difference is there? because ftrace implicitly allocate
      the buffer for max latency too.
      
      That makes sad result when admin want to use large buffer. (If admin
      want full logging and makes detail analysis). example, If admin
      have 24 CPUs machine and write 200MB to buffer_size_kb, the system
      consume ~10GB memory (200MB x 24 x 2). umm.. 5GB memory waste is
      usually unacceptable.
      
      Fortunatelly, almost all users don't use max latency feature.
      The max latency buffer can be disabled easily.
      
      This patch shrink buffer size of the max latency buffer if
      unnecessary.
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      LKML-Reference: <20100701104554.DA2D.A69D9226@jp.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ef710e10
    • P
      pcmcia: fix 'driver ... did not release config properly' warning · 418c5278
      Patrick McHardy 提交于
      Up to 2.6.34 pcmcia_release_irq() reset p_dev->_irq to 0 after releasing
      the irq.  The IRQ is now released in pcmcia_disable_device(), however
      p_dev->_irq is not reset, triggering a warning in pcmcia_device_remove().
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      418c5278
    • D
      mm: add context argument to shrinker callback to remaining shrinkers · 567c7b0e
      Dave Chinner 提交于
      Add the shrinkers missed in the first conversion of the API in
      commit 7f8275d0 ("mm: add context argument to
      shrinker callback").
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      567c7b0e
    • L
      tracing: Reduce latency and remove percpu trace_seq · bc289ae9
      Lai Jiangshan 提交于
      __print_flags() and __print_symbolic() use percpu trace_seq:
      
      1) Its memory is allocated at compile time, it wastes memory if we don't use tracing.
      2) It is percpu data and it wastes more memory for multi-cpus system.
      3) It disables preemption when it executes its core routine
         "trace_seq_printf(s, "%s: ", #call);" and introduces latency.
      
      So we move this trace_seq to struct trace_iterator.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4C078350.7090106@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      bc289ae9
    • R
      trace: Reorder struct ring_buffer_per_cpu to remove padding on 64bit · 985023de
      Richard Kennedy 提交于
      Reorder structure to remove 8 bytes of padding on 64 bit builds.
      This shrinks the size to 128 bytes so allowing allocation from a smaller
      slab & needed one fewer cache lines.
      Signed-off-by: NRichard Kennedy <richard@rsk.demon.co.uk>
      LKML-Reference: <1269516456.2054.8.camel@localhost>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      985023de
    • L
      tracing: Allow to disable cmdline recording · e870e9a1
      Li Zefan 提交于
      We found that even enabling a single trace event that will rarely be
      triggered can add big overhead to context switch.
      
      (lmbench context switch test)
       -------------------------------------------------
       2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
       ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
      ------ ------ ------ ------ ------ ------- -------
        2.19   2.3   2.21   2.56   2.13     2.54    2.07
        2.39   2.51  2.35   2.75   2.27     2.81    2.24
      
      The overhead is 6% ~ 11%.
      
      It's because when a trace event is enabled 3 tracepoints (sched_switch,
      sched_wakeup, sched_wakeup_new) will be activated to map pid to cmdname.
      
      We'd like to avoid this overhead, so add a trace option '(no)record-cmd'
      to allow to disable cmdline recording.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4C2D57F4.2050204@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e870e9a1
    • M
      math-emu: correct test for downshifting fraction in _FP_FROM_INT() · f8324e20
      Mikael Pettersson 提交于
      The kernel's math-emu code contains a macro _FP_FROM_INT() which is
      used to convert an integer to a raw normalized floating-point value.
      It does this basically in three steps:
      
      1. Compute the exponent from the number of leading zero bits.
      2. Downshift large fractions to put the MSB in the right position
         for normalized fractions.
      3. Upshift small fractions to put the MSB in the right position.
      
      There is an boundary error in step 2, causing a fraction with its
      MSB exactly one bit above the normalized MSB position to not be
      downshifted.  This results in a non-normalized raw float, which when
      packed becomes a massively inaccurate representation for that input.
      
      The impact of this depends on a number of arch-specific factors,
      but it is known to have broken emulation of FXTOD instructions
      on UltraSPARC III, which was originally reported as GCC bug 44631
      <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44631>.
      
      Any arch which uses math-emu to emulate conversions from integers to
      same-size floats may be affected.
      
      The fix is simple: the exponent comparison used to determine if the
      fraction should be downshifted must be "<=" not "<".
      
      I'm sending a kernel module to test this as a reply to this message.
      There are also SPARC user-space test cases in the GCC bug entry.
      Signed-off-by: NMikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f8324e20
    • L
      Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · f4b23cc2
      Linus Torvalds 提交于
      * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
        drm/r600: fix possible NULL pointer derefernce
        drm/radeon/kms: add quirk for ASUS HD 3600 board
        include/linux/vgaarb.h: add missing part of include guard
        drm/nouveau: Fix crashes during fbcon init on single head cards.
        drm/nouveau: fix pcirom vbios shadow breakage from acpi rom patch
        drm/radeon/kms: fix shared ddc harder
        drm/i915: enable low power render writes on GEN3 hardware.
        drm/i915: Define MI_ARB_STATE bits
        vmwgfx: return -EFAULT if copy_to_user fails
        fb: handle allocation failure in alloc_apertures()
        drm: radeon: check kzalloc() result
        drm/ttm: Fix build on architectures without AGP
        drm/radeon/kms: fix gtt MC base alignment on rs4xx/rs690/rs740 asics
        drm/radeon/kms: fix possible mis-detection of sideport on rs690/rs740
        drm/radeon/kms: fix legacy tv-out pal mode
      f4b23cc2
    • A