1. 08 Apr, 2015 1 commit
    • tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values · 0c564a53
      Steven Rostedt (Red Hat) committed
      Several tracepoints use the helper functions __print_symbolic() or
      __print_flags() and pass in enums that do the mapping between the
      binary data stored and the value to print. This works well for reading
      the ASCII trace files, but when the data is read via userspace tools
      such as perf and trace-cmd, the conversion of the binary value to a
      human string format is lost if an enum is used, as userspace does not
      have access to what the ENUM is.
      
      For example, the tracepoint trace_tlb_flush() has:
      
       __print_symbolic(REC->reason,
          { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
          { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
          { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
          { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })
      
      Which maps the enum values to the strings they represent. But perf and
      trace-cmd do not know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
      not be able to map it.
      
      With TRACE_DEFINE_ENUM(), developers can place these in the event header
      files and ftrace will convert the enums to their values.
      
      For example, adding:
      
       TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
       TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
       TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
       TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);
      
       $ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
      [...]
       __print_symbolic(REC->reason,
          { 0, "flush on task switch" },
          { 1, "remote shootdown" },
          { 2, "local shootdown" },
          { 3, "local mm shootdown" })
      
      The above is what userspace expects to see, and tools do not need to
      be modified to parse them.
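      The name-to-value mapping described above can be sketched in plain
      userspace C. This is only an illustration, not the kernel's
      implementation: struct enum_map, ENUM_MAP_ENTRY() and enum_map_lookup()
      are hypothetical names, and the enum is a local stand-in whose values
      simply follow declaration order.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in for the TLB flush reasons; values follow declaration order. */
enum tlb_flush_reason {
    TLB_FLUSH_ON_TASK_SWITCH,
    TLB_REMOTE_SHOOTDOWN,
    TLB_LOCAL_SHOOTDOWN,
    TLB_LOCAL_MM_SHOOTDOWN,
};

/* Hypothetical record pairing an enum's spelling with its numeric value. */
struct enum_map {
    const char *name;
    long value;
};

/* Hypothetical helper: stringify the enum name and capture its value. */
#define ENUM_MAP_ENTRY(e) { #e, (long)(e) }

static const struct enum_map tlb_enum_maps[] = {
    ENUM_MAP_ENTRY(TLB_FLUSH_ON_TASK_SWITCH),
    ENUM_MAP_ENTRY(TLB_REMOTE_SHOOTDOWN),
    ENUM_MAP_ENTRY(TLB_LOCAL_SHOOTDOWN),
    ENUM_MAP_ENTRY(TLB_LOCAL_MM_SHOOTDOWN),
};

/* Look up an enum name; store its value and return 0, or return -1 if unknown. */
static int enum_map_lookup(const char *name, long *value)
{
    size_t i;

    for (i = 0; i < sizeof(tlb_enum_maps) / sizeof(tlb_enum_maps[0]); i++) {
        if (strcmp(tlb_enum_maps[i].name, name) == 0) {
            *value = tlb_enum_maps[i].value;
            return 0;
        }
    }
    return -1;
}
```

      With such a table, a name found in a format string can be replaced by
      its number before the string is shown to userspace.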
      
      Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org
      
      Cc: Guilherme Cox <cox@computer.org>
      Cc: Tony Luck <tony.luck@gmail.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 03 Apr, 2015 1 commit
  3. 31 Mar, 2015 1 commit
  4. 25 Mar, 2015 4 commits
  5. 13 Mar, 2015 1 commit
    • kasan, module, vmalloc: rework shadow allocation for modules · a5af5aa8
      Andrey Ryabinin committed
      The current approach to handling shadow memory for modules is broken.
      
      Shadow memory can be freed only after the memory it corresponds to is no
      longer in use.  vfree() called from interrupt context may use the memory it
      is freeing to store a 'struct llist_node' in it:
      
          void vfree(const void *addr)
          {
          ...
              if (unlikely(in_interrupt())) {
                  struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
                  if (llist_add((struct llist_node *)addr, &p->list))
                          schedule_work(&p->wq);
      
      Later this list node is used in free_work(), which actually frees the memory.
      Currently, module_memfree() called in interrupt context frees the shadow
      before freeing the module's memory, which could provoke a kernel crash.
      
      So shadow memory should be freed after module's memory.  However, such
      deallocation order could race with kasan_module_alloc() in module_alloc().
      
      Free the shadow right before releasing the vm area.  At this point the
      vfree()'d memory is no longer used and not yet available for other
      allocations.  A new VM_KASAN flag indicates that a vm area has dynamically
      allocated shadow memory, so kasan frees the shadow only if it was
      previously allocated.
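      The reason the shadow must outlive the deferred free can be seen in a
      small userspace model. This is a single-threaded sketch under stated
      assumptions: deferred_free() and deferred_drain() are hypothetical
      names, and a plain singly linked list stands in for the kernel's
      lockless llist. The point is that the queue node lives inside the
      buffer being freed, so the buffer is still written and read after the
      caller has logically released it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Node stored inside the memory that is waiting to be freed. */
struct deferred_node {
    struct deferred_node *next;
};

static struct deferred_node *deferred_head;

/*
 * Queue buf for later release. The node is written into buf itself, so buf
 * must be at least sizeof(struct deferred_node) bytes and must stay
 * accessible until deferred_drain() runs.
 */
static void deferred_free(void *buf)
{
    struct deferred_node *node = buf;

    node->next = deferred_head;
    deferred_head = node;
}

/* Release every queued buffer; returns how many were freed. */
static size_t deferred_drain(void)
{
    size_t count = 0;

    while (deferred_head) {
        struct deferred_node *node = deferred_head;

        /* Read the link before the memory is handed back. */
        deferred_head = node->next;
        free(node);
        count++;
    }
    return count;
}
```

      Anything that tracks the validity of the buffer, as the shadow does,
      has to remain in place until the drain step has finished with it.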
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 09 Mar, 2015 3 commits
    • ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled · 524a3868
      Steven Rostedt (Red Hat) committed
      Some archs (specifically PowerPC) are sensitive to the ordering of
      enabling the calls to function tracing and setting the function
      to be traced.
      
      That is, update_ftrace_function() sets what function the ftrace_caller
      trampoline should call. Some archs require this to be set before
      calling ftrace_run_update_code().
      
      Another bug was discovered: ftrace_startup_sysctl() called
      ftrace_run_update_code() directly. If the function that the ftrace_caller
      trampoline calls changes, it will not be updated. Instead,
      ftrace_startup_enable() should be called, because it tests to see
      if the callback changed since the code was disabled, and will
      tell the arch to update appropriately. Most archs do not need this
      notification, but PowerPC does.
      
      The problem could be seen by the following commands:
      
       # echo 0 > /proc/sys/kernel/ftrace_enabled
       # echo function > /sys/kernel/debug/tracing/current_tracer
       # echo 1 > /proc/sys/kernel/ftrace_enabled
       # cat /sys/kernel/debug/tracing/trace
      
      The trace will show that function tracing was not active.
      
      Cc: stable@vger.kernel.org # 2.6.27+
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl · 1619dc3f
      Pratyush Anand committed
      When ftrace is enabled globally through the proc interface, we must check if
      ftrace_graph_active is set. If it is set, then we should also pass the
      FTRACE_START_FUNC_RET command to ftrace_run_update_code(). Similarly, when
      ftrace is disabled globally through the proc interface, we must check if
      ftrace_graph_active is set. If it is set, then we should also pass the
      FTRACE_STOP_FUNC_RET command to ftrace_run_update_code().
      
      Consider the following situation.
      
       # echo 0 > /proc/sys/kernel/ftrace_enabled
      
      After this ftrace_enabled = 0.
      
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
      Since ftrace_enabled = 0, ftrace_enable_ftrace_graph_caller() is never
      called.
      
       # echo 1 > /proc/sys/kernel/ftrace_enabled
      
      Now ftrace_enabled will be set to true, but still
      ftrace_enable_ftrace_graph_caller() will not be called, which is not
      desired.
      
      Further if we execute the following after this:
        # echo nop > /sys/kernel/debug/tracing/current_tracer
      
      Now since ftrace_enabled is set it will call
      ftrace_disable_ftrace_graph_caller(), which causes a kernel warning on
      the ARM platform.
      
      On the ARM platform, when ftrace_enable_ftrace_graph_caller() is called,
      it checks whether the old instruction is a nop or not. If it's not a nop,
      then it returns an error. If it is a nop then it replaces instruction at
      that address with a branch to ftrace_graph_caller.
      ftrace_disable_ftrace_graph_caller() behaves just the opposite. Therefore,
      if generic ftrace code ever calls either ftrace_enable_ftrace_graph_caller()
      or ftrace_disable_ftrace_graph_caller() twice in a row, the second call
      returns an error, which causes the generic ftrace code to raise a warning.
      
      Note, x86 does not have an issue with this because the architecture
      specific code for ftrace_enable_ftrace_graph_caller() and
      ftrace_disable_ftrace_graph_caller() does not check the previous state,
      and calling either of these functions twice in a row has no ill effect.
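      The strict behaviour described for ARM can be modelled with a single
      flag. This is a sketch, not the ARM code: site_is_branch and both
      function names are hypothetical, and a boolean stands in for the
      instruction at the patched address.

```c
#include <stdbool.h>

/* Hypothetical model of the patched call site: false = nop, true = branch. */
static bool site_is_branch;

/* Strict enable: only succeeds if the site currently holds a nop. */
static int strict_enable_graph_caller(void)
{
    if (site_is_branch)
        return -1; /* old instruction is not a nop */
    site_is_branch = true;
    return 0;
}

/* Strict disable: only succeeds if the site currently holds the branch. */
static int strict_disable_graph_caller(void)
{
    if (!site_is_branch)
        return -1; /* old instruction is not the branch */
    site_is_branch = false;
    return 0;
}
```

      In this model, disabling a site that was never enabled fails in the
      same way as the sequence of commands shown earlier, whereas a
      non-checking implementation would overwrite the site unconditionally.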
      
      Link: http://lkml.kernel.org/r/e4fbe64cdac0dd0e86a3bf914b0f83c0b419f146.1425666454.git.panand@redhat.com
      
      Cc: stable@vger.kernel.org # 2.6.31+
      Signed-off-by: Pratyush Anand <panand@redhat.com>
      [
        removed extra if (ftrace_start_up) and defined ftrace_graph_active as 0
        if CONFIG_FUNCTION_GRAPH_TRACER is not set.
      ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl · b24d443b
      Steven Rostedt (Red Hat) committed
      When /proc/sys/kernel/ftrace_enabled is set to zero, all function
      tracing is disabled. But the records that represent the functions
      still hold information about the ftrace_ops that are hooked to them.
      
      ftrace_ops may request "REGS" (have a full set of pt_regs passed to
      the callback), or "TRAMP" (the ops has its own trampoline to use).
      When the record is updated to represent the state of the ops hooked
      to it, it sets "REGS_EN" and/or "TRAMP_EN" to state that the callback
      points to the correct trampoline (REGS has its own trampoline).
      
      When ftrace_enabled is set to zero, all ftrace locations are a nop,
      so they do not point to any trampoline. But the _EN flags are still
      set. This can cause the accounting to go wrong when ftrace_enabled
      is cleared and an ops that has a trampoline is registered or unregistered.
      
      For example, the following will cause ftrace to crash:
      
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
       # echo 0 > /proc/sys/kernel/ftrace_enabled
       # echo nop > /sys/kernel/debug/tracing/current_tracer
       # echo 1 > /proc/sys/kernel/ftrace_enabled
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
      As function_graph uses a trampoline, when ftrace_enabled is set to zero
      the updates to the record are not done. When enabling function_graph
      again, the record will still have the TRAMP_EN flag set, and it will
      look for an op that has a trampoline other than the function_graph
      ops, and fail to find one.
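      The bookkeeping fix implied by the title can be sketched as a flag
      mask. The bit values and the function name below are assumptions made
      for illustration; the kernel's own FTRACE_FL_* constants and update
      path are not reproduced here.

```c
/* Hypothetical bit values; the kernel's FTRACE_FL_* constants differ. */
#define MODEL_FL_ENABLED  0x1u
#define MODEL_FL_REGS_EN  0x2u
#define MODEL_FL_TRAMP_EN 0x4u

/*
 * Model of turning a record's call site back into a nop: a nop points at no
 * trampoline, so both _EN bits are cleared together with ENABLED. Any other
 * bits in the record are left untouched.
 */
static unsigned int model_disable_record(unsigned int flags)
{
    return flags & ~(MODEL_FL_ENABLED | MODEL_FL_REGS_EN | MODEL_FL_TRAMP_EN);
}
```

      Leaving TRAMP_EN set after the site became a nop is the stale state
      that the later lookup trips over.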
      
      Cc: stable@vger.kernel.org # 3.17+
      Reported-by: Pratyush Anand <panand@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  7. 07 Mar, 2015 1 commit
  8. 06 Mar, 2015 2 commits
  9. 05 Mar, 2015 2 commits
    • workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE · 8603e1b3
      Tejun Heo committed
      cancel[_delayed]_work_sync() are implemented using
      __cancel_work_timer() which grabs the PENDING bit using
      try_to_grab_pending() and then flushes the work item with PENDING set
      to prevent the on-going execution of the work item from requeueing
      itself.
      
      try_to_grab_pending() can always grab PENDING bit without blocking
      except when someone else is doing the above flushing during
      cancelation.  In that case, try_to_grab_pending() returns -ENOENT.  In
      this case, __cancel_work_timer() currently invokes flush_work().  The
      assumption is that the completion of the work item is what the other
      canceling task would be waiting for too and thus waiting for the same
      condition and retrying should allow forward progress without excessive
      busy looping.
      
      Unfortunately, this doesn't work if preemption is disabled or the
      latter task has real time priority.  Let's say task A just got woken
      up from flush_work() by the completion of the target work item.  If,
      before task A starts executing, task B gets scheduled and invokes
      __cancel_work_timer() on the same work item, its try_to_grab_pending()
      will return -ENOENT as the work item is still being canceled by task A
      and flush_work() will also immediately return false as the work item
      is no longer executing.  This puts task B in a busy loop possibly
      preventing task A from executing and clearing the canceling state on
      the work item leading to a hang.
      
      task A			task B			worker
      
      						executing work
      __cancel_work_timer()
        try_to_grab_pending()
        set work CANCELING
        flush_work()
          block for work completion
      						completion, wakes up A
      			__cancel_work_timer()
      			while (forever) {
      			  try_to_grab_pending()
      			    -ENOENT as work is being canceled
      			  flush_work()
      			    false as work is no longer executing
      			}
      
      This patch removes the possible hang by updating __cancel_work_timer()
      to explicitly wait for clearing of CANCELING rather than invoking
      flush_work() after try_to_grab_pending() fails with -ENOENT.
      
      Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
      
      v3: bit_waitqueue() can't be used for work items defined in vmalloc
          area.  Switched to custom wake function which matches the target
          work item and exclusive wait and wakeup.
      
      v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
          the target bit waitqueue has wait_bit_queue's on it.  Use
          DEFINE_WAIT_BIT() and __wake_up_bit() instead.  Reported by Tomeu
          Vizoso.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Rabin Vincent <rabin.vincent@axis.com>
      Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com>
      Cc: stable@vger.kernel.org
      Tested-by: Jesper Nilsson <jesper.nilsson@axis.com>
      Tested-by: Rabin Vincent <rabin.vincent@axis.com>
    • genirq / PM: Add flag for shared NO_SUSPEND interrupt lines · 17f48034
      Rafael J. Wysocki committed
      It currently is required that all users of NO_SUSPEND interrupt
      lines pass the IRQF_NO_SUSPEND flag when requesting the IRQ or the
      WARN_ON_ONCE() in irq_pm_install_action() will trigger.  That is
      done to warn about situations in which unprepared interrupt handlers
      may be run unnecessarily for suspended devices and may attempt to
      access those devices by mistake.  However, it may cause drivers
      that have no technical reasons for using IRQF_NO_SUSPEND to set
      that flag just because they happen to share the interrupt line
      with something like a timer.
      
      Moreover, the generic handling of wakeup interrupts introduced by
      commit 9ce7a258 (genirq: Simplify wakeup mechanism) only works
      for IRQs without any NO_SUSPEND users, so the drivers of wakeup
      devices needing to use shared NO_SUSPEND interrupt lines for
      signaling system wakeup generally have to detect wakeup in their
      interrupt handlers.  Thus if they happen to share an interrupt line
      with a NO_SUSPEND user, they also need to request that their
      interrupt handlers be run after suspend_device_irqs().
      
      In both cases the reason for using IRQF_NO_SUSPEND is not because
      the driver in question has a genuine need to run its interrupt
      handler after suspend_device_irqs(), but because it happens to
      share the line with some other NO_SUSPEND user.  Otherwise, the
      driver would do without IRQF_NO_SUSPEND just fine.
      
      To make it possible to specify that condition explicitly, introduce
      a new IRQ action handler flag for shared IRQs, IRQF_COND_SUSPEND,
      that, when set, will indicate to the IRQ core that the interrupt
      user is generally fine with suspending the IRQ, but it also can
      tolerate handler invocations after suspend_device_irqs() and, in
      particular, it is capable of detecting system wakeup and triggering
      it as appropriate from its interrupt handler.
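      The rule described above can be modelled with three counters per
      interrupt line. This is a sketch based only on the description in this
      message: the MODEL_IRQF_* values, struct model_irq_line and
      model_install_action() are hypothetical and do not reproduce the
      kernel's irq_pm_install_action().

```c
#include <stdbool.h>

/* Hypothetical flag values for the model; not the kernel's IRQF_* values. */
#define MODEL_IRQF_NO_SUSPEND   0x1u
#define MODEL_IRQF_COND_SUSPEND 0x2u

/* Per-line bookkeeping for the actions sharing one interrupt line. */
struct model_irq_line {
    unsigned int nr_actions;
    unsigned int no_suspend_depth;
    unsigned int cond_suspend_depth;
};

/*
 * Register one more user of the line. Returns true if the mix of users is
 * consistent, false where the model would warn: some user needs NO_SUSPEND
 * but another user neither needs it nor tolerates it.
 */
static bool model_install_action(struct model_irq_line *line, unsigned int flags)
{
    line->nr_actions++;
    if (flags & MODEL_IRQF_NO_SUSPEND)
        line->no_suspend_depth++;
    else if (flags & MODEL_IRQF_COND_SUSPEND)
        line->cond_suspend_depth++;

    return !(line->no_suspend_depth &&
             line->no_suspend_depth + line->cond_suspend_depth !=
             line->nr_actions);
}
```

      A timer that passes the NO_SUSPEND flag and a neighbour that passes the
      COND_SUSPEND flag coexist without a warning in this model, while a
      neighbour that passes neither flag triggers it.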
      
      That will allow us to work around a problem with a shared timer
      interrupt line on at91 platforms.
      
      Link: http://marc.info/?l=linux-kernel&m=142252777602084&w=2
      Link: http://marc.info/?t=142252775300011&r=1&w=2
      Link: https://lkml.org/lkml/2014/12/15/552
      Reported-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
  10. 03 Mar, 2015 5 commits
  11. 01 Mar, 2015 3 commits
  12. 23 Feb, 2015 1 commit
  13. 20 Feb, 2015 8 commits
    • debug: prevent entering debug mode on panic/exception. · 5516fd7b
      Colin Cross committed
      On non-developer devices, kgdb prevents the device from rebooting
      after a panic.
      
      In case of panics and exceptions, to allow the device to reboot, prevent
      entering debug mode to avoid getting stuck waiting for the user to
      interact with the debugger.
      
      To avoid entering the debugger on panic/exception without any extra
      configuration, panic_timeout is being used which can be set via
      /proc/sys/kernel/panic at run time and CONFIG_PANIC_TIMEOUT sets the
      default value.
      
      Setting panic_timeout indicates that the user requested the machine to
      perform an unattended reboot after a panic. We don't want to get stuck
      waiting for user input in case of a panic.
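      The decision can be sketched as a single predicate. The function name
      is hypothetical and the patch's actual hook points are not shown here;
      the only assumption is that any non-zero panic_timeout means an
      unattended reboot was requested.

```c
#include <stdbool.h>

/*
 * Model of the check: a panic_timeout of zero means "wait forever", so
 * stopping in the debugger is acceptable; any non-zero value asks for an
 * unattended reboot, so the debugger is skipped.
 */
static bool should_enter_debugger(int panic_timeout)
{
    return panic_timeout == 0;
}
```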
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: linux-kernel@vger.kernel.org
      Cc: Android Kernel Team <kernel-team@android.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Signed-off-by: Colin Cross <ccross@android.com>
      [Kiran: Added context to commit message.
      panic_timeout is used instead of break_on_panic and
      break_on_exception to honor CONFIG_PANIC_TIMEOUT
      Modified the commit as per community feedback]
      Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org>
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb: Const qualifier for kdb_getstr's prompt argument · 32d375f6
      Daniel Thompson committed
      All current callers of kdb_getstr() can pass constant pointers via the
      prompt argument. This patch adds a const qualification to make explicit
      the fact that this is safe.
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb: Provide forward search at more prompt · fb6daa75
      Daniel Thompson committed
      Currently kdb allows the output of commands to be filtered using the
      | grep feature. This is useful but does not permit the output emitted
      shortly after a string match to be examined without wading through the
      entire unfiltered output of the command. Such a feature is particularly
      useful to navigate function traces because these traces often have a
      useful trigger string *before* the point of interest.
      
      This patch reuses the existing filtering logic to introduce a simple
      forward search to kdb that can be triggered from the more prompt.
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb: Fix a prompt management bug when using | grep · ab08e464
      Daniel Thompson committed
      Currently when the "| grep" feature is used to filter the output of a
      command then the prompt is not displayed for the subsequent command.
      Likewise any characters typed by the user are also not echoed to the
      display. This rather disconcerting problem eventually corrects itself
      when the user presses Enter and the kdb_grepping_flag is cleared as
      kdb_parse() tries to make sense of whatever they typed.
      
      This patch resolves the problem by moving the clearing of this flag
      from the middle of command processing to the beginning.
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb: Remove stack dump when entering kgdb due to NMI · 54543881
      Daniel Thompson committed
      Issuing a stack dump feels ergonomically wrong when entering due to NMI.
      
      Entering due to NMI is normally a reaction to a user request, either the
      NMI button on a server or a "magic knock" on a UART. Therefore the
      backtrace behaviour on entry due to NMI should be like SysRq-g (no stack
      dump) rather than like oops.
      
      Note also that the stack dump does not offer any information that
      cannot be trivially retrieved using the 'bt' command.
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb: Avoid printing KERN_ levels to consoles · f7d4ca8b
      Daniel Thompson committed
      Currently when kdb traps printk messages then the raw log level prefix
      (consisting of '\001' followed by a numeral) does not get stripped off
      before the message is issued to the various I/O handlers supported by
      kdb. This causes annoying visual noise as well as causing problems
      grepping for ^. It is also a change of behaviour compared to normal usage
      of printk(). For example <SysRq>-h ends up with different output to
      that of kdb's "sr h".
      
      This patch addresses the problem by stripping log levels from messages
      before they are issued to the I/O handlers. printk() which can also
      act as an i/o handler in some cases is special cased; if the caller
      provided a log level then the prefix will be preserved when sent to
      printk().
      
      The addition of non-printable characters to the output of kdb commands is a
      regression, albeit an extremely elderly one, introduced by commit
      04d2c8c8 ("printk: convert the format for KERN_<LEVEL> to a 2 byte
      pattern"). Note also that this patch does *not* restore the original
      behaviour from v3.5. Instead it makes printk() from within a kdb command
      display the message without any prefix (i.e. like printk() normally does).
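      Stripping the two-byte prefix can be sketched as follows. This is a
      userspace illustration with a hypothetical function name; it handles
      only the form described above ('\001' followed by a numeral) and is
      not the kernel's own helper.

```c
#include <stddef.h>

/* First byte of the log level prefix, as described in the text above. */
#define MODEL_KERN_SOH '\001'

/*
 * Return a pointer past the log level prefix if msg starts with one,
 * otherwise return msg unchanged. Only a single decimal digit after the
 * marker byte is treated as a level in this model.
 */
static const char *skip_log_level(const char *msg)
{
    if (msg[0] == MODEL_KERN_SOH && msg[1] >= '0' && msg[1] <= '9')
        return msg + 2;
    return msg;
}
```

      An I/O handler would print the returned pointer, so a message without
      a prefix passes through untouched.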
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb: Fix off by one error in kdb_cpu() · df0036d1
      Jason Wessel committed
      There was a follow on replacement patch against the prior
      "kgdb: Timeout if secondary CPUs ignore the roundup".
      
      See: https://lkml.org/lkml/2015/1/7/442
      
      This patch is the delta vs the patch that was committed upstream:
        * Fix an off-by-one error in kdb_cpu().
        * Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we
          really want a static limit.
        * Removed the "KGDB: " prefix from the pr_crit() in debug_core.c
          (kgdb-next contains a patch which introduced pr_fmt() to this file
          so the tag will now be applied automatically).
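      The message does not show the corrected check, so the following is
      only a generic illustration of how an off-by-one arises when
      validating a CPU index against a static limit. MODEL_NR_CPUS and
      cpu_index_valid() are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical static limit standing in for the configured CPU count. */
#define MODEL_NR_CPUS 4

/*
 * Valid indices run from 0 to MODEL_NR_CPUS - 1. Rejecting only values
 * greater than MODEL_NR_CPUS would wrongly accept MODEL_NR_CPUS itself,
 * which is one past the last valid index.
 */
static bool cpu_index_valid(unsigned long cpunum)
{
    return cpunum < MODEL_NR_CPUS;
}
```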
      
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb: fix incorrect counts in KDB summary command output · 14675592
      Jay Lan committed
      The output of the KDB 'summary' command should report MemTotal, MemFree
      and Buffers in kB. The current code reports them in units of pages.
      
      A define of K(x) as
      is defined in the code, but not used.
      
      This patch would apply the define to convert the values to kB.
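      The body of the define is not shown in this message, so the following
      is only a sketch of a page-to-kB conversion of the usual shape.
      MODEL_PAGE_SHIFT and pages_to_kb() are hypothetical, and 4 KiB pages
      are assumed.

```c
/* Assumption: 4 KiB pages, i.e. a page shift of 12. */
#define MODEL_PAGE_SHIFT 12

/*
 * Convert a page count to kB. One page is 2^MODEL_PAGE_SHIFT bytes and one
 * kB is 2^10 bytes, so the count is shifted left by the difference.
 */
static unsigned long pages_to_kb(unsigned long pages)
{
    return pages << (MODEL_PAGE_SHIFT - 10);
}
```

      Under that assumption a report of 256 pages becomes 1024 kB.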
      Please include me on Cc on replies. I do not subscribe to linux-kernel.
      Signed-off-by: Jay Lan <jlan@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
  14. 18 Feb, 2015 7 commits