1. 15 Feb, 2011 — 1 commit
  2. 29 Oct, 2010 — 1 commit
  3. 25 Aug, 2010 — 1 commit
    • oprofile: fix crash when accessing freed task structs · 750d857c
      Committed by Robert Richter
      This patch fixes a crash during shutdown reported below. The crash is
      caused by accessing already freed task structs. The fix changes the
      order for registering and unregistering notifier callbacks.
      
      All notifiers must be initialized before the buffers start working. To
      stop buffer synchronization we cancel all workqueues, unregister the
      notifier callback, and then flush all buffers. Only after all of this
      can the listed tasks finally be freed (see the ordering sketch after
      this entry).
      
      This should avoid accessing freed tasks.
      
      On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote:
      
      > So the initial observation is a spinlock bad magic followed by a crash
      > in the spinlock debug code:
      >
      > [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136
      > [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03
      >
      > Backtrace looks like:
      >
      >       spin_bug+0x74/0xd4
      >       ._raw_spin_lock+0x48/0x184
      >       ._spin_lock+0x10/0x24
      >       .get_task_mm+0x28/0x8c
      >       .sync_buffer+0x1b4/0x598
      >       .wq_sync_buffer+0xa0/0xdc
      >       .worker_thread+0x1d8/0x2a8
      >       .kthread+0xa8/0xb4
      >       .kernel_thread+0x54/0x70
      >
      > So we are accessing a freed task struct in the work queue when
      > processing the samples.
      Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: stable@kernel.org
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      750d857c
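      The ordering is the whole fix, so a condensed sketch may help. This is
      a minimal sketch of the shutdown path, not the literal patch: the
      helper names only approximate drivers/oprofile/buffer_sync.c, and
      flush_cpu_buffers() in particular is a hypothetical stand-in for the
      buffer flush step.

      static void sync_stop(void)
      {
              /* 1. Cancel the per-cpu work so no new sync passes get scheduled. */
              end_cpu_work();

              /* 2. Unregister every notifier: no more dying tasks are queued. */
              unregister_module_notifier(&module_load_nb);
              profile_event_unregister(PROFILE_TASK_EXIT, &task_exit_nb);
              profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
              task_handoff_unregister(&task_free_nb);

              /* 3. Flush the buffers while every queued sample still has a live task. */
              flush_cpu_buffers();    /* hypothetical stand-in */

              /* 4. Only now is it safe to free the tasks collected for freeing. */
              free_all_tasks();
      }

      With the old ordering, step 4 could run while a workqueue was still
      calling get_task_mm() on a queued task; the 0x6b6b... address in the
      backtrace above is the slab use-after-free poison pattern, which is
      exactly that access.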
  4. 04 May, 2010 — 1 commit
  5. 23 Apr, 2010 — 1 commit
    • oprofile: remove double ring buffering · cb6e943c
      Committed by Andi Kleen
      oprofile used a double buffer scheme for its cpu event buffer
      to avoid races on reading with the old locked ring buffer.
      
      But that is obsolete now with the new ring buffer, so simply
      use a single buffer. This greatly simplifies the code and avoids
      a lot of sample drops on large runs, especially with call-graph
      profiling.
      
      Based on suggestions from Steven Rostedt.
      
      For stable kernels from v2.6.32, but not earlier.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: stable <stable@kernel.org>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      cb6e943c
  6. 01 Apr, 2010 — 1 commit
    • ring-buffer: Add place holder recording of dropped events · 66a8cb95
      Committed by Steven Rostedt
      Currently, when the ring buffer drops events, it does not record
      the fact that it did so. It does inform the writer that the event
      was dropped by returning a NULL event, but it does not put any
      placeholder where the event was dropped.
      
      This is not a trivial thing to add because the ring buffer mostly
      runs in overwrite (flight recorder) mode. That is, when the ring
      buffer is full, new data will overwrite old data.
      
      In a producer/consumer mode, where new data is simply dropped when
      the ring buffer is full, it is trivial to add the placeholder
      for dropped events. When there's more room to write new data, then
      a special event can be added to notify the reader about the dropped
      events.
      
      But in overwrite mode, any new write can overwrite events. A
      placeholder cannot be inserted into the ring buffer since there may
      never be room. A reader could also come in at any time and miss the
      placeholder.
      
      Luckily, the way the ring buffer works, the read side can find out
      if events were lost or not, and how many. Every time a write
      takes place, if it overwrites the header page (the next read) it
      updates an "overrun" variable that keeps track of the number of
      lost events. When a reader swaps out a page from the ring buffer,
      it can record this number, perform the swap, and then check to
      see if the number changed, taking the diff if it has, which is
      the number of events dropped. This can be stored by the reader
      and returned to callers of the reader.
      
      Since the reader page swap will fail if the writer moved the head
      page after the reader set up the swap, this gives room to record
      the overruns without worrying about races: if the reader sets up
      the pages and records the overrun before performing the swap, then
      a successful swap guarantees the overrun variable has not been
      updated since the setup. (A sketch of this read-side bookkeeping
      follows the entry.)
      
      For binary readers of the ring buffer, a flag is set in the header
      of each sub page (sub buffer) of the ring buffer. This flag is embedded
      in the size field of the data on the sub buffer, in the 31st bit (the size
      can be 32 or 64 bits depending on the architecture), but only 27
      bits need to be used for the actual size (less, actually).
      
      We could add a new field in the sub buffer header to also record the
      number of events dropped since the last read, but this will change the
      format of the binary ring buffer a bit too much. Perhaps this change can
      be made if the information on the number of events dropped is considered
      important enough.
      
      Note, the notification of dropped events is only used by consuming reads
      or peeking at the ring buffer. Iterating over the ring buffer does not
      keep this information because the necessary data is only available when
      a page swap is made, and the iterator does not swap out pages.
      
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      66a8cb95
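      Since the snapshot-then-swap dance is subtle, here is a minimal sketch
      of the read-side bookkeeping under stated assumptions: the field and
      helper names (overrun, last_overrun, try_swap_reader_page()) are
      illustrative stand-ins, not the kernel's actual symbols.

      #define RB_MISSED_EVENTS  (1UL << 31)  /* flag bit in the sub-buffer size field */

      struct cpu_buffer {
              unsigned long overrun;          /* bumped by the writer per lost event */
              unsigned long last_overrun;     /* snapshot from the previous read */
      };

      /* Fails (returns false) if the writer moved the head page meanwhile. */
      static bool try_swap_reader_page(struct cpu_buffer *cpu_buffer);

      /* Returns the number of events dropped since the previous read. */
      static unsigned long reader_swap(struct cpu_buffer *cpu_buffer)
      {
              unsigned long overruns, lost;

              overruns = cpu_buffer->overrun;         /* record first */

              if (!try_swap_reader_page(cpu_buffer))
                      return 0;                       /* set up and retry later */

              /* Success means the writer never passed the head page between
               * the snapshot and the swap, so the snapshot is consistent. */
              lost = overruns - cpu_buffer->last_overrun;
              cpu_buffer->last_overrun = overruns;
              return lost;
      }

      For the binary case, the sub buffer's size field would then carry the
      flag (size | RB_MISSED_EVENTS) whenever lost is non-zero.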
  7. 29 Oct, 2009 — 1 commit
    • percpu: make percpu symbols in oprofile unique · b3e9f672
      Committed by Tejun Heo
      This patch updates percpu-related symbols in oprofile so that percpu
      symbols are unique and don't clash with local symbols.  This serves
      two purposes: decreasing the possibility of global percpu symbol
      collisions, and allowing the per_cpu__ prefix to be dropped from
      percpu symbols (a before/after sketch follows this entry).
      
      * drivers/oprofile/cpu_buffer.c: s/cpu_buffer/op_cpu_buffer/
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Robert Richter <robert.richter@amd.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      b3e9f672
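      In concrete terms, the rename has this shape. The declaration site and
      access pattern below are assumptions for illustration; only the old
      and new symbol names come straight from the s/// line above.

      /* before: the percpu symbol clashes with local variables named cpu_buffer */
      DEFINE_PER_CPU(struct oprofile_cpu_buffer, cpu_buffer);

      /* after: a unique, namespaced percpu symbol */
      DEFINE_PER_CPU(struct oprofile_cpu_buffer, op_cpu_buffer);

      /* accesses follow the rename, e.g.: */
      struct oprofile_cpu_buffer *b = &per_cpu(op_cpu_buffer, cpu);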
  8. 12 Jun, 2009 — 1 commit
  9. 11 Jun, 2009 — 1 commit
  10. 07 May, 2009 — 1 commit
  11. 06 Feb, 2009 — 1 commit
  12. 18 Jan, 2009 — 1 commit
  13. 08 Jan, 2009 — 11 commits
  14. 30 Dec, 2008 — 4 commits
  15. 29 Dec, 2008 — 2 commits
  16. 17 Dec, 2008 — 1 commit
  17. 11 Dec, 2008 — 2 commits
    • oprofile: fix lost sample counter · 211117ff
      Committed by Robert Richter
      The number of lost samples could be greater than the number of
      received samples. This patch fixes that. The implementation
      introduces return values for add_sample() and add_code() (a sketch
      follows this entry).
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      211117ff
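      A simplified sketch of the idea, with condensed signatures: once
      add_sample() reports failure, the caller bumps the lost-sample counter
      exactly once per lost sample, so the counter can no longer outrun the
      received samples. buffer_full() is a hypothetical predicate;
      sample_lost_overflow mirrors oprofile's per-cpu stats field.

      static int add_sample(struct oprofile_cpu_buffer *cpu_buf,
                            unsigned long pc, unsigned long event)
      {
              if (buffer_full(cpu_buf))       /* hypothetical predicate */
                      return -ENOMEM;         /* report the loss to the caller */
              /* ... write pc and event into the buffer ... */
              return 0;
      }

      static void log_sample(struct oprofile_cpu_buffer *cpu_buf,
                             unsigned long pc, unsigned long event)
      {
              if (add_sample(cpu_buf, pc, event))
                      cpu_buf->sample_lost_overflow++;  /* counted once, here */
      }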
    • oprofile: remove nr_available_slots() · 1d7503b5
      Committed by Robert Richter
      This function is no longer available after the port to the new ring
      buffer. Its removal can lead to incomplete sampling sequences, since
      IBS samples and backtraces are transferred in multiple samples. Due to
      a full buffer, samples could be lost at any time. The userspace daemon
      has to live with such incomplete sampling sequences as long as the
      data within one sample is consistent.
      
      This will be fixed by changing the internal buffer layout so that all
      data of one IBS sample or a backtrace is packed into a single ring
      buffer entry (see the sketch after this entry). This is possible since
      the new ring buffer supports variable data sizes.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      1d7503b5
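      A sketch of the direction described above, assuming the variable-size
      API that the oprofile buffer code later grew (the op_cpu_buffer_*
      helpers and ESCAPE_CODE are taken on trust here, not from this
      commit): one IBS sample travels as a single reservation, so it is
      either committed complete or dropped as a unit.

      static void log_ibs_sample(struct oprofile_cpu_buffer *cpu_buf,
                                 unsigned long ibs_code,
                                 unsigned long *ibs_data, int ibs_words)
      {
              struct op_entry entry;
              struct op_sample *sample;
              int i;

              /* one entry sized for the header plus all IBS words */
              sample = op_cpu_buffer_write_reserve(&entry, 2 + ibs_words);
              if (!sample)
                      return;  /* buffer full: the whole sample is dropped as a unit */

              sample->eip = ESCAPE_CODE;
              sample->event = ibs_code;
              for (i = 0; i < ibs_words; i++)
                      op_cpu_buffer_add_data(&entry, ibs_data[i]);

              op_cpu_buffer_write_commit(&entry);
      }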
  18. 10 Dec, 2008 — 6 commits
    • oprofile: port to the new ring_buffer · 6dad828b
      Committed by Robert Richter
      This patch replaces the current oprofile cpu buffer implementation
      with the ring buffer provided by the tracing framework. The motivation
      here is to leave the pain of implementing ring buffers to others. Oh
      no, there are more advantages. The main reason is the support for
      different sample sizes that can be stored in the buffer. Use cases for
      this are IBS and Cell spu profiling. Using the new ring buffer ensures
      valid and complete samples and allows copying the cpu buffer statelessly
      without knowing its content. Second, it will use a generic kernel API
      and also reduce code size. And hopefully, there are fewer bugs.
      
      Since the new tracing ring buffer implementation uses spin locks to
      protect the buffer during read/write access, it is difficult to use
      the buffer in an NMI handler. In this case, writing to the buffer by
      the NMI handler (x86) could also occur during critical sections while
      the buffer is being read. To avoid this, there are 2 buffers for
      independent read and write access: read access happens in process
      context only, write access only in the NMI handler. If the read buffer
      runs empty, both buffers are swapped atomically (see the sketch after
      this entry). There is potentially a small window during swapping where
      the buffers are disabled and samples could be lost.
      
      Using 2 buffers adds a little overhead, but the solution is clear and
      does not require changes in the ring buffer implementation. It can be
      changed to a single-buffer solution once ring buffer access is
      implemented as non-locking atomic code.
      
      The new buffer requires more space to store the same amount of samples,
      because each sample includes a u32 header. Also, there is more code to
      execute for buffer access. Nonetheless, the buffer implementation is
      proven in the ftrace environment and is worth using in oprofile as well.
      
      Patches that change the internal IBS buffer usage will follow.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      6dad828b
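      The 2-buffer scheme is easiest to see as code. A minimal sketch,
      assuming the tracing framework's ring_buffer_record_disable()/enable()
      API and an illustrative swap helper that is not the literal patch:

      static struct ring_buffer *op_ring_buffer_read;   /* process context only */
      static struct ring_buffer *op_ring_buffer_write;  /* NMI handler only */

      static void op_cpu_buffer_swap(void)
      {
              /* the small window where samples can be lost, as noted above */
              ring_buffer_record_disable(op_ring_buffer_read);
              ring_buffer_record_disable(op_ring_buffer_write);

              swap(op_ring_buffer_read, op_ring_buffer_write);  /* linux/kernel.h */

              ring_buffer_record_enable(op_ring_buffer_read);
              ring_buffer_record_enable(op_ring_buffer_write);
      }

      Because reads never happen in NMI context and writes never happen in
      process context, neither side ever spins on a lock the other holds.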
    • oprofile: moving cpu_buffer_reset() to cpu_buffer.h · fbc9bf9f
      Committed by Robert Richter
      This is in preparation for changes in the cpu buffer implementation.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      fbc9bf9f
    • oprofile: adding cpu_buffer_write_commit() · 229234ae
      Committed by Robert Richter
      This is in preparation for changes in the cpu buffer implementation.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      229234ae
    • oprofile: adding cpu buffer r/w access functions · 7d468abe
      Committed by Robert Richter
      This is in preparation for changes in the cpu buffer implementation.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      7d468abe
    • oprofile: whitespace changes only · cdc1834d
      Committed by Robert Richter
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      cdc1834d
    • oprofile: comment cleanup · fd13f6c8
      Committed by Robert Richter
      This fixes the coding style of some comments.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      fd13f6c8
  19. 21 Oct, 2008 — 1 commit
    • powerpc/oprofile: Fix mutex locking for cell spu-oprofile · a5598ca0
      Committed by Carl Love
      The issue is that the SPU code does not hold the kernel mutex lock
      while adding samples to the kernel buffer.
      
      This patch creates per-SPU buffers to hold the data.  Data is added
      to the buffers in interrupt context.  The data is periodically pushed
      to the kernel buffer via a new OProfile function, oprofile_put_buff().
      The oprofile_put_buff() function is called via a work queue, enabling
      the function to acquire the mutex lock (see the sketch after this
      entry).
      
      The existing user controls for adjusting the per-CPU buffer size
      are used to control the size of the per-SPU buffers.  Similarly,
      overflows of the SPU buffers are reported by incrementing the
      per-CPU buffer stats.  This eliminates the need for
      architecture-specific controls for the per-SPU buffers, which would
      not be acceptable to the OProfile user tool maintainer.
      
      The export of the oprofile add_event_entry() is removed as it
      is no longer needed given this patch.
      
      Note, this patch has not addressed the issue of indexing arrays by
      the spu number.  This still needs to be fixed, as the spu numbering
      is not guaranteed to be 0 to max_num_spus-1.
      Signed-off-by: Carl Love <carll@us.ibm.com>
      Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      a5598ca0
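      A sketch of the deferral pattern under stated assumptions: the buffer
      layout, NUM_SPUS, and the start/stop argument order of
      oprofile_put_buff() are illustrative guesses; only the core point,
      that the mutex may be taken solely from the work queue's process
      context, comes from the message above.

      #include <linux/workqueue.h>
      #include <linux/oprofile.h>

      #define NUM_SPUS 16     /* illustrative; real numbering is per node */

      struct spu_buffer {
              unsigned long *buff;
              unsigned int head, tail, size;
      };

      static struct spu_buffer spu_buf[NUM_SPUS];
      static struct delayed_work spu_work;

      /* Interrupt context only appends samples to spu_buf[spu]; no mutex. */

      static void wq_sync_spu_buff(struct work_struct *work)
      {
              int spu;

              for (spu = 0; spu < NUM_SPUS; spu++) {
                      struct spu_buffer *b = &spu_buf[spu];

                      /* process context: taking the event-buffer mutex is
                       * legal, and oprofile_put_buff() does so while
                       * copying the samples out */
                      oprofile_put_buff(b->buff, b->tail, b->head, b->size);
              }

              /* re-arm so the per-SPU buffers keep draining periodically */
              schedule_delayed_work(&spu_work, msecs_to_jiffies(100));
      }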
  20. 17 Oct, 2008 — 1 commit