1. 31 1月, 2013 3 次提交
  2. 30 1月, 2013 1 次提交
    • S
      tracing/fgraph: Adjust fgraph depth before calling trace return callback · 03274a3f
      Steven Rostedt (Red Hat) 提交于
      While debugging the virtual cputime with the function graph tracer
      with a max_depth of 1 (most common use of the max_depth so far),
      I found that I was missing kernel execution because of a race condition.
      
      The code for the return side of the function has a slight race:
      
      	ftrace_pop_return_trace(&trace, &ret, frame_pointer);
      	trace.rettime = trace_clock_local();
      	ftrace_graph_return(&trace);
      	barrier();
      	current->curr_ret_stack--;
      
      The ftrace_pop_return_trace() initializes the trace structure for
      the callback. The ftrace_graph_return() uses the trace structure
      for its own use as that structure is on the stack and is local
      to this function. Then the curr_ret_stack is decremented which
      is what the trace.depth is set to.
      
      If an interrupt comes in after the ftrace_graph_return() but
      before the curr_ret_stack, then the called function will get
      a depth of 2. If max_depth is set to 1 this function will be
      ignored.
      
      The problem is that the trace has already been called, and the
      timestamp for that trace will not reflect the time the function
      was about to re-enter userspace. Calls to the interrupt will not
      be traced because the max_depth has prevented this.
      
      To solve this issue, the ftrace_graph_return() can safely be
      moved after the current->curr_ret_stack has been updated.
      This way the timestamp for the return callback will reflect
      the actual time.
      
      If an interrupt comes in after the curr_ret_stack update and
      ftrace_graph_return(), it will be traced. It may look a little
      confusing to see it within the other function, but at least
      it will not be lost.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      03274a3f
  3. 29 1月, 2013 1 次提交
  4. 26 1月, 2013 2 次提交
  5. 25 1月, 2013 1 次提交
  6. 24 1月, 2013 16 次提交
  7. 23 1月, 2013 16 次提交
    • T
      ALSA: hda - Fix inconsistent pin states after resume · 31614bb8
      Takashi Iwai 提交于
      The commit [26a6cb6c: ALSA: hda - Implement a poll loop for jacks as a
      module parameter] introduced the polling jack detection code, but it
      also moved the call of snd_hda_jack_set_dirty_all() in the resume path
      after resume/init ops call.  This caused a regression when the jack
      state has been changed during power-down (e.g. in the power save
      mode).  Since the driver doesn't probe the new jack state but keeps
      using the cached value due to no dirty flag, the pin state remains
      also as if the jack is still plugged.
      
      The fix is simply moving snd_hda_jack_set_dirty_all() to the original
      position.
      Reported-by: NManolo Díaz <diaz.manolo@gmail.com>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      31614bb8
    • S
      ring-buffer: Remove trace.h from ring_buffer.c · 0b07436d
      Steven Rostedt 提交于
      ring_buffer.c use to require declarations from trace.h, but
      these have moved to the generic header files. There's nothing
      in trace.h that ring_buffer.c requires.
      
      There's some headers that trace.h included that ring_buffer.c
      needs, but it's best that it includes them directly, and not
      include trace.h.
      
      Also, some things may use ring_buffer.c without having tracing
      configured. This removes the dependency that may come in the
      future.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      0b07436d
    • S
      ring-buffer: User context bit recursion checking · 567cd4da
      Steven Rostedt 提交于
      Using context bit recursion checking, we can help increase the
      performance of the ring buffer.
      
      Before this patch:
      
       # echo function > /debug/tracing/current_tracer
       # for i in `seq 10`; do ./hackbench 50; done
      Time: 10.285
      Time: 10.407
      Time: 10.243
      Time: 10.372
      Time: 10.380
      Time: 10.198
      Time: 10.272
      Time: 10.354
      Time: 10.248
      Time: 10.253
      
      (average: 10.3012)
      
      Now we have:
      
       # echo function > /debug/tracing/current_tracer
       # for i in `seq 10`; do ./hackbench 50; done
      Time: 9.712
      Time: 9.824
      Time: 9.861
      Time: 9.827
      Time: 9.962
      Time: 9.905
      Time: 9.886
      Time: 10.088
      Time: 9.861
      Time: 9.834
      
      (average: 9.876)
      
       a 4% savings!
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      567cd4da
    • S
      ftrace: Use only the preempt version of function tracing · 897f68a4
      Steven Rostedt 提交于
      The function tracer had two different versions of function tracing.
      
      The disabling of irqs version and the preempt disable version.
      
      As function tracing in very intrusive and can cause nasty recursion
      issues, it has its own recursion protection. But the old method to
      do this was a flat layer. If it detected that a recursion was happening
      then it would just return without recording.
      
      This made the preempt version (much faster than the irq disabling one)
      not very useful, because if an interrupt were to occur after the
      recursion flag was set, the interrupt would not be traced at all,
      because every function that was traced would think it recursed on
      itself (due to the context it preempted setting the recursive flag).
      
      Now that we have a recursion flag for every context level, we
      no longer need to worry about that. We can disable preemption,
      set the current context recursion check bit, and go on. If an
      interrupt were to come along, it would check its own context bit
      and happily continue to trace.
      
      As the preempt version is faster than the irq disable version,
      there's no more reason to keep the preempt version around.
      And the irq disable version still had an issue with missing
      out on tracing NMI code.
      
      Remove the irq disable function tracer version and have the
      preempt disable version be the default (and only version).
      
      Before this patch we had from running:
      
       # echo function > /debug/tracing/current_tracer
       # for i in `seq 10`; do ./hackbench 50; done
      Time: 12.028
      Time: 11.945
      Time: 11.925
      Time: 11.964
      Time: 12.002
      Time: 11.910
      Time: 11.944
      Time: 11.929
      Time: 11.941
      Time: 11.924
      
      (average: 11.9512)
      
      Now we have:
      
       # echo function > /debug/tracing/current_tracer
       # for i in `seq 10`; do ./hackbench 50; done
      Time: 10.285
      Time: 10.407
      Time: 10.243
      Time: 10.372
      Time: 10.380
      Time: 10.198
      Time: 10.272
      Time: 10.354
      Time: 10.248
      Time: 10.253
      
      (average: 10.3012)
      
       a 13.8% savings!
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      897f68a4
    • S
      tracing: Avoid unnecessary multiple recursion checks · edc15caf
      Steven Rostedt 提交于
      When function tracing occurs, the following steps are made:
        If arch does not support a ftrace feature:
         call internal function (uses INTERNAL bits) which calls...
        If callback is registered to the "global" list, the list
         function is called and recursion checks the GLOBAL bits.
         then this function calls...
        The function callback, which can use the FTRACE bits to
         check for recursion.
      
      Now if the arch does not suppport a feature, and it calls
      the global list function which calls the ftrace callback
      all three of these steps will do a recursion protection.
      There's no reason to do one if the previous caller already
      did. The recursion that we are protecting against will
      go through the same steps again.
      
      To prevent the multiple recursion checks, if a recursion
      bit is set that is higher than the MAX bit of the current
      check, then we know that the check was made by the previous
      caller, and we can skip the current check.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      edc15caf
    • S
      tracing: Make the trace recursion bits into enums · e46cbf75
      Steven Rostedt 提交于
      Convert the bits into enums which makes the code a little easier
      to maintain.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e46cbf75
    • S
      ftrace: Add context level recursion bit checking · c29f122c
      Steven Rostedt 提交于
      Currently for recursion checking in the function tracer, ftrace
      tests a task_struct bit to determine if the function tracer had
      recursed or not. If it has, then it will will return without going
      further.
      
      But this leads to races. If an interrupt came in after the bit
      was set, the functions being traced would see that bit set and
      think that the function tracer recursed on itself, and would return.
      
      Instead add a bit for each context (normal, softirq, irq and nmi).
      
      A check of which context the task is in is made before testing the
      associated bit. Now if an interrupt preempts the function tracer
      after the previous context has been set, the interrupt functions
      can still be traced.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c29f122c
    • S
      ftrace: Optimize the function tracer list loop · 0a016409
      Steven Rostedt 提交于
      There is lots of places that perform:
      
             op = rcu_dereference_raw(ftrace_control_list);
             while (op != &ftrace_list_end) {
      
      Add a helper macro to do this, and also optimize for a single
      entity. That is, gcc will optimize a loop for either no iterations
      or more than one iteration. But usually only a single callback
      is registered to the function tracer, thus the optimized case
      should be a single pass. to do this we now do:
      
      	op = rcu_dereference_raw(list);
      	do {
      		[...]
      	} while (likely(op = rcu_dereference_raw((op)->next)) &&
      	       unlikely((op) != &ftrace_list_end));
      
      An op is always registered (ftrace_list_end when no callbacks is
      registered), thus when a single callback is registered, the link
      list looks like:
      
       top => callback => ftrace_list_end => NULL.
      
      The likely(op = op->next) still must be performed due to the race
      of removing the callback, where the first op assignment could
      equal ftrace_list_end. In that case, the op->next would be NULL.
      But this is unlikely (only happens in a race condition when
      removing the callback).
      
      But it is very likely that the next op would be ftrace_list_end,
      unless more than one callback has been registered. This tells
      gcc what the most common case is and makes the fast path with
      the least amount of branches.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      0a016409
    • S
      ftrace: Fix function tracing recursion self test · 9640388b
      Steven Rostedt 提交于
      The function tracing recursion self test should not crash
      the machine if the resursion test fails. If it detects that
      the function tracing is recursing when it should not be, then
      bail, don't go into an infinite recursive loop.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      9640388b
    • S
      ftrace: Fix global function tracers that are not recursion safe · 63503794
      Steven Rostedt 提交于
      If one of the function tracers set by the global ops is not recursion
      safe, it can still be called directly without the added recursion
      supplied by the ftrace infrastructure.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      63503794
    • S
      tracing: Fix selftest function recursion accounting · 05cbbf64
      Steven Rostedt 提交于
      The test that checks function recursion does things differently
      if the arch does not support all ftrace features. But that really
      doesn't make a difference with how the test runs, and either way
      the count variable should be 2 at the end.
      
      Currently the test wrongly fails for archs that don't support all
      the ftrace features.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      05cbbf64
    • S
      tracing: Fix race with max_tr and changing tracers · 34600f0e
      Steven Rostedt 提交于
      There's a race condition between the setting of a new tracer and
      the update of the max trace buffers (the swap). When a new tracer
      is added, it sets current_trace to nop_trace before disabling
      the old tracer. At this moment, if the old tracer uses update_max_tr(),
      the update may trigger the warning against !current_trace->use_max-tr,
      as nop_trace doesn't have that set.
      
      As update_max_tr() requires that interrupts be disabled, we can
      add a check to see if current_trace == nop_trace and bail if it
      does. Then when disabling the current_trace, set it to nop_trace
      and run synchronize_sched(). This will make sure all calls to
      update_max_tr() have completed (it was called with interrupts disabled).
      
      As a clean up, this commit also removes shrinking and recreating
      the max_tr buffer if the old and new tracers both have use_max_tr set.
      The old way use to always shrink the buffer, and then expand it
      for the next tracer. This is a waste of time.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      34600f0e
    • L
      Revert "drivers/misc/ti-st: remove gpio handling" · a7e2ca17
      Luciano Coelho 提交于
      This reverts commit eccf2979.
      
      The reason is that it broke TI WiLink shared transport on Panda.
      Also, callback functions should not be added to board files anymore,
      so revert to implementing the power functions in the driver itself.
      
      Additionally, changed a variable name ('status' to 'err') so that this
      revert compiles properly.
      
      Cc: stable <stable@vger.kernel.org> [3.7]
      Acked-by: NTony Lindgren <tony@atomide.com>
      Signed-off-by: NLuciano Coelho <coelho@ti.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7e2ca17
    • L
      Merge tag '3.8-pci-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 1d854908
      Linus Torvalds 提交于
      Pull PCI updates from Bjorn Helgaas:
       "The most important is a fix for a pciehp deadlock that occurs when
        unplugging a Thunderbolt adapter.  We also applied the same fix to
        shpchp, removed CONFIG_EXPERIMENTAL dependencies, fixed a
        pcie_aspm=force problem, and fixed a refcount leak.
      
        Details:
      
         - Hotplug
            PCI: pciehp: Use per-slot workqueues to avoid deadlock
            PCI: shpchp: Make shpchp_wq non-ordered
            PCI: shpchp: Handle push button event asynchronously
            PCI: shpchp: Use per-slot workqueues to avoid deadlock
      
         - Power management
            PCI: Allow pcie_aspm=force even when FADT indicates it is unsupported
      
         - Misc
            PCI/AER: pci_get_domain_bus_and_slot() call missing required pci_dev_put()
            PCI: remove depends on CONFIG_EXPERIMENTAL"
      
      * tag '3.8-pci-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: remove depends on CONFIG_EXPERIMENTAL
        PCI: Allow pcie_aspm=force even when FADT indicates it is unsupported
        PCI: shpchp: Use per-slot workqueues to avoid deadlock
        PCI: shpchp: Handle push button event asynchronously
        PCI: shpchp: Make shpchp_wq non-ordered
        PCI/AER: pci_get_domain_bus_and_slot() call missing required pci_dev_put()
        PCI: pciehp: Use per-slot workqueues to avoid deadlock
      1d854908
    • T
      async: fix __lowest_in_progress() · f56c3196
      Tejun Heo 提交于
      Commit 083b804c ("async: use workqueue for worker pool") made it
      possible that async jobs are moved from pending to running out-of-order.
      While pending async jobs will be queued and dispatched for execution in
      the same order, nothing guarantees they'll enter "1) move self to the
      running queue" of async_run_entry_fn() in the same order.
      
      Before the conversion, async implemented its own worker pool.  An async
      worker, upon being woken up, fetches the first item from the pending
      list, which kept the executing lists sorted.  The conversion to
      workqueue was done by adding work_struct to each async_entry and async
      just schedules the work item.  The queueing and dispatching of such work
      items are still in order but now each worker thread is associated with a
      specific async_entry and moves that specific async_entry to the
      executing list.  So, depending on which worker reaches that point
      earlier, which is non-deterministic, we may end up moving an async_entry
      with larger cookie before one with smaller one.
      
      This broke __lowest_in_progress().  running->domain may not be properly
      sorted and is not guaranteed to contain lower cookies than pending list
      when not empty.  Fix it by ensuring sort-inserting to the running list
      and always looking at both pending and running when trying to determine
      the lowest cookie.
      
      Over time, the async synchronization implementation became quite messy.
      We better restructure it such that each async_entry is linked to two
      lists - one global and one per domain - and not move it when execution
      starts.  There's no reason to distinguish pending and running.  They
      behave the same for synchronization purposes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56c3196
    • L
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux · ed06ef31
      Linus Torvalds 提交于
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
       . revert 20b279 - require exclude_guest to use PEBS - kernel side, now
         older binaries will continue working for things like cycles:pp
         without needing to pass extra modifiers, from David Ahern.
      
       . Fix building from 'make perf-*-src-pkg' tarballs, broken by UAPI,
         from Sebastian Andrzej Siewior
      
      [ Pulling directly, Ingo would normally pull but has been unresponsive ]
      
      * tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf tools: Fix building from 'make perf-*-src-pkg' tarballs
        perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side
      ed06ef31