1. 09 5月, 2017 1 次提交
    • H
      crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE · 692f66f2
      Hari Bathini 提交于
      Patch series "kexec/fadump: remove dependency with CONFIG_KEXEC and
      reuse crashkernel parameter for fadump", v4.
      
      Traditionally, kdump is used to save vmcore in case of a crash.  Some
      architectures like powerpc can save vmcore using architecture specific
      support instead of kexec/kdump mechanism.  Such architecture specific
      support also needs to reserve memory, to be used by dump capture kernel.
      crashkernel parameter can be a reused, for memory reservation, by such
      architecture specific infrastructure.
      
      This patchset removes dependency with CONFIG_KEXEC for crashkernel
      parameter and vmcoreinfo related code as it can be reused without kexec
      support.  Also, crashkernel parameter is reused instead of
      fadump_reserve_mem to reserve memory for fadump.
      
      The first patch moves crashkernel parameter parsing and vmcoreinfo
      related code under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE.  The
      second patch reuses the definitions of append_elf_note() & final_note()
      functions under CONFIG_CRASH_CORE in IA64 arch code.  The third patch
      removes dependency on CONFIG_KEXEC for firmware-assisted dump (fadump)
      in powerpc.  The next patch reuses crashkernel parameter for reserving
      memory for fadump, instead of the fadump_reserve_mem parameter.  This
      has the advantage of using all syntaxes crashkernel parameter supports,
      for fadump as well.  The last patch updates fadump kernel documentation
      about use of crashkernel parameter.
      
      This patch (of 5):
      
      Traditionally, kdump is used to save vmcore in case of a crash.  Some
      architectures like powerpc can save vmcore using architecture specific
      support instead of kexec/kdump mechanism.  Such architecture specific
      support also needs to reserve memory, to be used by dump capture kernel.
      crashkernel parameter can be a reused, for memory reservation, by such
      architecture specific infrastructure.
      
      But currently, code related to vmcoreinfo and parsing of crashkernel
      parameter is built under CONFIG_KEXEC_CORE.  This patch introduces
      CONFIG_CRASH_CORE and moves the above mentioned code under this config,
      allowing code reuse without dependency on CONFIG_KEXEC.  There is no
      functional change with this patch.
      
      Link: http://lkml.kernel.org/r/149035338104.6881.4550894432615189948.stgit@hbathini.in.ibm.comSigned-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: NDave Young <dyoung@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      692f66f2
  2. 19 4月, 2017 1 次提交
  3. 12 4月, 2017 3 次提交
  4. 04 4月, 2017 1 次提交
    • P
      printk: Correctly handle preemption in console_unlock() · 257ab443
      Petr Mladek 提交于
      Some console drivers code calls console_conditional_schedule()
      that looks at @console_may_schedule. The value must be cleared
      when the drivers are called from console_unlock() with
      interrupts disabled. But rescheduling is fine when the same
      code is called, for example, from tty operations where the
      console semaphore is taken via console_lock().
      
      This is why @console_may_schedule is cleared before calling console
      drivers. The original value is stored to decide if we could sleep
      between lines.
      
      Now, @console_may_schedule is not cleared when we call
      console_trylock() and jump back to the "again" goto label.
      This has become a problem, since the commit 6b97a20d
      ("printk: set may_schedule for some of console_trylock() callers").
      @console_may_schedule might get enabled now.
      
      There is also the opposite problem. console_lock() can be called
      only from preemptive context. It can always enable scheduling in
      the console code. But console_trylock() is not able to detect it
      when CONFIG_PREEMPT_COUNT is disabled. Therefore we should use the
      original @console_may_schedule value after re-acquiring
      the console semaphore in console_unlock().
      
      This patch solves both problems by moving the "again" goto label.
      
      Alternative solution was to clear and restore the value around
      call_console_drivers(). Then console_conditional_schedule() could
      be used also inside console_unlock(). But there was a potential race
      with console_flush_on_panic() as reported by Sergey Senozhatsky.
      That function should be called only where there is only one CPU
      and with interrupts disabled. But better be on the safe side
      because stopping CPUs might fail.
      
      Fixes: 6b97a20d ("printk: set may_schedule for some of console_trylock() callers")
      Link: http://lkml.kernel.org/r/1490372045-22288-1-git-send-email-pmladek@suse.comSuggested-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: linux-fbdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      257ab443
  5. 24 3月, 2017 1 次提交
    • S
      printk: use console_trylock() in console_cpu_notify() · 64ca752d
      Sergey Senozhatsky 提交于
      There is no need to always call blocking console_lock() in
      console_cpu_notify(), it's quite possible that console_sem can
      be locked by other CPU on the system, either already printing
      or soon to begin printing the messages. console_lock() in this
      case can simply block CPU hotplug for unknown period of time
      (console_unlock() is time unbound). Not that hotplug is very
      fast, but still, with other CPUs being online and doing
      printk() console_cpu_notify() can stuck.
      
      Use console_trylock() instead and opt-out if console_sem is
      already acquired from another CPU, since that CPU will do
      the printing for us.
      
      Link: http://lkml.kernel.org/r/20170121104729.8585-1-sergey.senozhatsky@gmail.com
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      64ca752d
  6. 02 3月, 2017 3 次提交
  7. 19 2月, 2017 1 次提交
    • S
      printk: use rcuidle console tracepoint · fc98c3c8
      Sergey Senozhatsky 提交于
      Use rcuidle console tracepoint because, apparently, it may be issued
      from an idle CPU:
      
        hw-breakpoint: Failed to enable monitor mode on CPU 0.
        hw-breakpoint: CPU 0 failed to disable vector catch
      
        ===============================
        [ ERR: suspicious RCU usage.  ]
        4.10.0-rc8-next-20170215+ #119 Not tainted
        -------------------------------
        ./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
        RCU used illegally from idle CPU!
        rcu_scheduler_active = 2, debug_locks = 0
        RCU used illegally from extended quiescent state!
        2 locks held by swapper/0/0:
         #0:  (cpu_pm_notifier_lock){......}, at: [<c0237e2c>] cpu_pm_exit+0x10/0x54
         #1:  (console_lock){+.+.+.}, at: [<c01ab350>] vprintk_emit+0x264/0x474
      
        stack backtrace:
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119
        Hardware name: Generic OMAP4 (Flattened Device Tree)
          console_unlock
          vprintk_emit
          vprintk_default
          printk
          reset_ctrl_regs
          dbg_cpu_pm_notify
          notifier_call_chain
          cpu_pm_exit
          omap_enter_idle_coupled
          cpuidle_enter_state
          cpuidle_enter_state_coupled
          do_idle
          cpu_startup_entry
          start_kernel
      
      This RCU warning, however, is suppressed by lockdep_off() in printk().
      lockdep_off() increments the ->lockdep_recursion counter and thus
      disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want
      lockdep to be enabled "current->lockdep_recursion == 0".
      
      Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.comSigned-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: NTony Lindgren <tony@atomide.com>
      Tested-by: NTony Lindgren <tony@atomide.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Russell King <rmk@armlinux.org.uk>
      Cc: <stable@vger.kernel.org> [3.4+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fc98c3c8
  8. 08 2月, 2017 7 次提交
    • S
      printk: drop call_console_drivers() unused param · d9c23523
      Sergey Senozhatsky 提交于
      We do suppress_message_printing() check before we call
      call_console_drivers() now, so `level' param is not needed
      anymore.
      
      Link: http://lkml.kernel.org/r/20161224140902.1962-2-sergey.senozhatsky@gmail.com
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      d9c23523
    • S
      printk: convert the rest to printk-safe · de6fcbdb
      Sergey Senozhatsky 提交于
      This patch converts the rest of logbuf users (which are
      out of printk recursion case, but can deadlock in printk).
      To make printk-safe usage easier the patch introduces 4
      helper macros:
      - logbuf_lock_irq()/logbuf_unlock_irq()
        lock/unlock the logbuf lock and disable/enable local IRQ
      
      - logbuf_lock_irqsave(flags)/logbuf_unlock_irqrestore(flags)
        lock/unlock the logbuf lock and saves/restores local IRQ state
      
      Link: http://lkml.kernel.org/r/20161227141611.940-9-sergey.senozhatsky@gmail.com
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      de6fcbdb
    • S
      printk: remove zap_locks() function · 8b1742c9
      Sergey Senozhatsky 提交于
      We use printk-safe now which makes printk-recursion detection code
      in vprintk_emit() unreachable. The tricky thing here is that, apart
      from detecting and reporting printk recursions, that code also used
      to zap_locks() in case of panic() from the same CPU. However,
      zap_locks() does not look to be needed anymore:
      
      1) Since commit 08d78658 ("panic: release stale console lock to
         always get the logbuf printed out") panic flushing of `logbuf' to
         console ignores the state of `console_sem' by doing
         	panic()
      		console_trylock();
      		console_unlock();
      
      2) Since commit cf9b1106 ("printk/nmi: flush NMI messages on the
         system panic") panic attempts to zap the `logbuf_lock' spin_lock to
         successfully flush nmi messages to `logbuf'.
      
      Basically, it seems that we either already do what zap_locks() used to
      do but in other places or we ignore the state of the lock. The only
      reaming difference is that we don't re-init the console semaphore in
      printk_safe_flush_on_panic(), but this is not necessary because we
      don't call console drivers from printk_safe_flush_on_panic() due to
      the fact that we are using a deferred printk() version (as was
      suggested by Petr Mladek).
      
      Link: http://lkml.kernel.org/r/20161227141611.940-8-sergey.senozhatsky@gmail.com
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      8b1742c9
    • S
      printk: use printk_safe buffers in printk · f975237b
      Sergey Senozhatsky 提交于
      Use printk_safe per-CPU buffers in printk recursion-prone blocks:
      -- around logbuf_lock protected sections in vprintk_emit() and
         console_unlock()
      -- around down_trylock_console_sem() and up_console_sem()
      
      Note that this solution addresses deadlocks caused by printk()
      recursive calls only. That is vprintk_emit() and console_unlock().
      The rest will be converted in a followup patch.
      
      Another thing to note is that we now keep lockdep enabled in printk,
      because we are protected against the printk recursion caused by
      lockdep in vprintk_emit() by the printk-safe mechanism - we first
      switch to per-CPU buffers and only then access the deadlock-prone
      locks.
      
      Examples:
      
      1) printk() from logbuf_lock spin_lock section
      
      Assume the following code:
        printk()
          raw_spin_lock(&logbuf_lock);
          WARN_ON(1);
          raw_spin_unlock(&logbuf_lock);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 366 at kernel/printk/printk.c:1811 vprintk_emit
       CPU: 0 PID: 366 Comm: bash
       Call Trace:
         warn_slowpath_null+0x1d/0x1f
         vprintk_emit+0x1cd/0x438
         vprintk_default+0x1d/0x1f
         printk+0x48/0x50
        [..]
      
      2) printk() from semaphore sem->lock spin_lock section
      
      Assume the following code
      
        printk()
          console_trylock()
            down_trylock()
              raw_spin_lock_irqsave(&sem->lock, flags);
              WARN_ON(1);
              raw_spin_unlock_irqrestore(&sem->lock, flags);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 363 at kernel/locking/semaphore.c:141 down_trylock
       CPU: 1 PID: 363 Comm: bash
       Call Trace:
         warn_slowpath_null+0x1d/0x1f
         down_trylock+0x3d/0x62
         ? vprintk_emit+0x3f9/0x414
         console_trylock+0x31/0xeb
         vprintk_emit+0x3f9/0x414
         vprintk_default+0x1d/0x1f
         printk+0x48/0x50
        [..]
      
      3) printk() from console_unlock()
      
      Assume the following code:
      
        printk()
          console_unlock()
            raw_spin_lock(&logbuf_lock);
            WARN_ON(1);
            raw_spin_unlock(&logbuf_lock);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 329 at kernel/printk/printk.c:2384 console_unlock
       CPU: 1 PID: 329 Comm: bash
       Call Trace:
         warn_slowpath_null+0x18/0x1a
         console_unlock+0x12d/0x559
         ? trace_hardirqs_on_caller+0x16d/0x189
         ? trace_hardirqs_on+0xd/0xf
         vprintk_emit+0x363/0x374
         vprintk_default+0x18/0x1a
         printk+0x43/0x4b
        [..]
      
      4) printk() from try_to_wake_up()
      
      Assume the following code:
      
        printk()
          console_unlock()
            up()
              try_to_wake_up()
                raw_spin_lock_irqsave(&p->pi_lock, flags);
                WARN_ON(1);
                raw_spin_unlock_irqrestore(&p->pi_lock, flags);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 3 PID: 363 at kernel/sched/core.c:2028 try_to_wake_up
       CPU: 3 PID: 363 Comm: bash
       Call Trace:
         warn_slowpath_null+0x1d/0x1f
         try_to_wake_up+0x7f/0x4f7
         wake_up_process+0x15/0x17
         __up.isra.0+0x56/0x63
         up+0x32/0x42
         __up_console_sem+0x37/0x55
         console_unlock+0x21e/0x4c2
         vprintk_emit+0x41c/0x462
         vprintk_default+0x1d/0x1f
         printk+0x48/0x50
        [..]
      
      5) printk() from call_console_drivers()
      
      Assume the following code:
        printk()
          console_unlock()
            call_console_drivers()
            ...
                WARN_ON(1);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 2 PID: 305 at kernel/printk/printk.c:1604 call_console_drivers
       CPU: 2 PID: 305 Comm: bash
       Call Trace:
         warn_slowpath_null+0x18/0x1a
         call_console_drivers.isra.6.constprop.16+0x3a/0xb0
         console_unlock+0x471/0x48e
         vprintk_emit+0x1f4/0x206
         vprintk_default+0x18/0x1a
         vprintk_func+0x6e/0x70
         printk+0x3e/0x46
        [..]
      
      6) unsupported placeholder in printk() format now prints an actual
         warning from vscnprintf(), instead of
         	'BUG: recent printk recursion!'.
      
       ------------[ cut here ]------------
       WARNING: CPU: 5 PID: 337 at lib/vsprintf.c:1900 format_decode
       Please remove unsupported %
        in format string
       CPU: 5 PID: 337 Comm: bash
       Call Trace:
         dump_stack+0x4f/0x65
         __warn+0xc2/0xdd
         warn_slowpath_fmt+0x4b/0x53
         format_decode+0x22c/0x308
         vsnprintf+0x89/0x3b7
         vscnprintf+0xd/0x26
         vprintk_emit+0xb4/0x238
         vprintk_default+0x1d/0x1f
         vprintk_func+0x6c/0x73
         printk+0x43/0x4b
        [..]
      
      Link: http://lkml.kernel.org/r/20161227141611.940-7-sergey.senozhatsky@gmail.com
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      Reviewed-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      f975237b
    • S
      printk: report lost messages in printk safe/nmi contexts · ddb9baa8
      Sergey Senozhatsky 提交于
      Account lost messages in pritk-safe and printk-safe-nmi
      contexts and report those numbers during printk_safe_flush().
      
      The patch also moves lost message counter to struct
      `printk_safe_seq_buf' instead of having dedicated static
      counters - this simplifies the code.
      
      Link: http://lkml.kernel.org/r/20161227141611.940-6-sergey.senozhatsky@gmail.com
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      ddb9baa8
    • S
      printk: introduce per-cpu safe_print seq buffer · 099f1c84
      Sergey Senozhatsky 提交于
      This patch extends the idea of NMI per-cpu buffers to regions
      that may cause recursive printk() calls and possible deadlocks.
      Namely, printk() can't handle printk calls from schedule code
      or printk() calls from lock debugging code (spin_dump() for instance);
      because those may be called with `sem->lock' already taken or any
      other `critical' locks (p->pi_lock, etc.). An example of deadlock
      can be
      
       vprintk_emit()
        console_unlock()
         up()                        << raw_spin_lock_irqsave(&sem->lock, flags);
          wake_up_process()
           try_to_wake_up()
            ttwu_queue()
             ttwu_activate()
              activate_task()
               enqueue_task()
                enqueue_task_fair()
                 cfs_rq_of()
                  task_of()
                   WARN_ON_ONCE(!entity_is_task(se))
                    vprintk_emit()
                     console_trylock()
                      down_trylock()
                       raw_spin_lock_irqsave(&sem->lock, flags)
                       ^^^^ deadlock
      
      and some other cases.
      
      Just like in NMI implementation, the solution uses a per-cpu
      `printk_func' pointer to 'redirect' printk() calls to a 'safe'
      callback, that store messages in a per-cpu buffer and flushes
      them back to logbuf buffer later.
      
      Usage example:
      
       printk()
        printk_safe_enter_irqsave(flags)
        //
        //  any printk() call from here will endup in vprintk_safe(),
        //  that stores messages in a special per-CPU buffer.
        //
        printk_safe_exit_irqrestore(flags)
      
      The 'redirection' mechanism, though, has been reworked, as suggested
      by Petr Mladek. Instead of using a per-cpu @print_func callback we now
      keep a per-cpu printk-context variable and call either default or nmi
      vprintk function depending on its value. printk_nmi_entrer/exit and
      printk_safe_enter/exit, thus, just set/celar corresponding bits in
      printk-context functions.
      
      The patch only adds printk_safe support, we don't use it yet.
      
      Link: http://lkml.kernel.org/r/20161227141611.940-4-sergey.senozhatsky@gmail.com
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      Reviewed-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      099f1c84
    • S
      printk: use vprintk_func in vprintk() · bd66a892
      Sergey Senozhatsky 提交于
      vprintk(), just like printk(), better be using per-cpu printk_func
      instead of direct vprintk_emit() call. Just in case if vprintk()
      will ever be called from NMI, or from any other context that can
      deadlock in printk().
      
      Link: http://lkml.kernel.org/r/20161227141611.940-2-sergey.senozhatsky@gmail.com
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      bd66a892
  9. 25 12月, 2016 1 次提交
  10. 16 12月, 2016 1 次提交
  11. 15 12月, 2016 3 次提交
    • L
      printk: remove console flushing special cases for partial buffered lines · 5c2992ee
      Linus Torvalds 提交于
      It actively hurts proper merging, and makes for a lot of special cases.
      There was a good(ish) reason for doing it originally, but it's getting
      too painful to maintain.  And most of the original reasons for it are
      long gone.
      
      So instead of having special code to flush partial lines to the console
      (as opposed to the record buffers), do _all_ the console writing from
      the record buffer, and be done with it.
      
      If an oops happens (or some other synchronous event), we will flush the
      partial lines due to the oops printing activity, so this does not affect
      that.  It does mean that if you have a completely hung machine, a
      partial preceding line may not have been printed out.
      
      That was some of the original reason for this complexity, in fact, back
      when we used to test for the historical i386 "halt" instruction problem
      by doing
      
      	pr_info("Checking 'hlt' instruction... ");
      
      	if (!boot_cpu_data.hlt_works_ok) {
      		pr_cont("disabled\n");
      		return;
      	}
      	halt();
      	halt();
      	halt();
      	halt();
      	pr_cont("OK\n");
      
      and that model no longer works (it the 'hlt' instruction kills the
      machine, the partial line won't have been flushed, so you won't even see
      it).
      
      Of course, that was also back in the days when people actually had
      textual console output rather than a graphical splash-screen at bootup.
      How times change..
      
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Tested-by: NPetr Mladek <pmladek@suse.com>
      Tested-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5c2992ee
    • L
      printk: remove games with previous record flags · 5aa068ea
      Linus Torvalds 提交于
      The record logging code looks at the previous record flags in various
      ways, and they are all wrong.
      
      You can't use the previous record flags to determine anything about the
      next record, because they may simply not be related.  In particular, the
      reason the previous record was a continuation record may well be exactly
      _because_ the new record was printed by a different process, which is
      why the previous record was flushed.
      
      So all those games are simply wrong, and make the code hard to
      understand (because the code fundamentally cdoes not make sense).
      
      So remove it.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5aa068ea
    • P
      kdb: call vkdb_printf() from vprintk_default() only when wanted · 34aaff40
      Petr Mladek 提交于
      kdb_trap_printk allows to pass normal printk() messages to kdb via
      vkdb_printk().  For example, it is used to get backtrace using the
      classic show_stack(), see kdb_show_stack().
      
      vkdb_printf() tries to avoid a potential infinite loop by disabling the
      trap.  But this approach is racy, for example:
      
      CPU1					CPU2
      
      vkdb_printf()
        // assume that kdb_trap_printk == 0
        saved_trap_printk = kdb_trap_printk;
        kdb_trap_printk = 0;
      
      					kdb_show_stack()
      					  kdb_trap_printk++;
      
      Problem1: Now, a nested printk() on CPU0 calls vkdb_printf()
      	  even when it should have been disabled. It will not
      	  cause a deadlock but...
      
         // using the outdated saved value: 0
         kdb_trap_printk = saved_trap_printk;
      
      					  kdb_trap_printk--;
      
      Problem2: Now, kdb_trap_printk == -1 and will stay like this.
         It means that all messages will get passed to kdb from
         now on.
      
      This patch removes the racy saved_trap_printk handling.  Instead, the
      recursion is prevented by a check for the locked CPU.
      
      The solution is still kind of racy.  A non-related printk(), from
      another process, might get trapped by vkdb_printf().  And the wanted
      printk() might not get trapped because kdb_printf_cpu is assigned.  But
      this problem existed even with the original code.
      
      A proper solution would be to get_cpu() before setting kdb_trap_printk
      and trap messages only from this CPU.  I am not sure if it is worth the
      effort, though.
      
      In fact, the race is very theoretical.  When kdb is running any of the
      commands that use kdb_trap_printk there is a single active CPU and the
      other CPUs should be in a holding pen inside kgdb_cpu_enter().
      
      The only time this is violated is when there is a timeout waiting for
      the other CPUs to report to the holding pen.
      
      Finally, note that the situation is a bit schizophrenic.  vkdb_printf()
      explicitly allows recursion but only from KDB code that calls
      kdb_printf() directly.  On the other hand, the generic printk()
      recursion is not allowed because it might cause an infinite loop.  This
      is why we could not hide the decision inside vkdb_printf() easily.
      
      Link: http://lkml.kernel.org/r/1480412276-16690-4-git-send-email-pmladek@suse.comSigned-off-by: NPetr Mladek <pmladek@suse.com>
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      34aaff40
  12. 06 12月, 2016 1 次提交
    • A
      [iov_iter] new primitives - copy_from_iter_full() and friends · cbbd26b8
      Al Viro 提交于
      copy_from_iter_full(), copy_from_iter_full_nocache() and
      csum_and_copy_from_iter_full() - counterparts of copy_from_iter()
      et.al., advancing iterator only in case of successful full copy
      and returning whether it had been successful or not.
      
      Convert some obvious users.  *NOTE* - do not blindly assume that
      something is a good candidate for those unless you are sure that
      not advancing iov_iter in failure case is the right thing in
      this case.  Anything that does short read/short write kind of
      stuff (or is in a loop, etc.) is unlikely to be a good one.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      cbbd26b8
  13. 18 11月, 2016 1 次提交
  14. 15 11月, 2016 1 次提交
  15. 12 11月, 2016 1 次提交
    • H
      Revert "console: don't prefer first registered if DT specifies stdout-path" · c6c7d83b
      Hans de Goede 提交于
      This reverts commit 05fd007e ("console: don't prefer first
      registered if DT specifies stdout-path").
      
      The reverted commit changes existing behavior on which many ARM boards
      rely.  Many ARM small-board-computers, like e.g.  the Raspberry Pi have
      both a video output and a serial console.  Depending on whether the user
      is using the device as a more regular computer; or as a headless device
      we need to have the console on either one or the other.
      
      Many users rely on the kernel behavior of the console being present on
      both outputs, before the reverted commit the console setup with no
      console= kernel arguments on an ARM board which sets stdout-path in dt
      would look like this:
      
        [root@localhost ~]# cat /proc/consoles
        ttyS0                -W- (EC p a)    4:64
        tty0                 -WU (E  p  )    4:1
      
      Where as after the reverted commit, it looks like this:
      
        [root@localhost ~]# cat /proc/consoles
        ttyS0                -W- (EC p a)    4:64
      
      This commit reverts commit 05fd007e ("console: don't prefer first
      registered if DT specifies stdout-path") restoring the original
      behavior.
      
      Fixes: 05fd007e ("console: don't prefer first registered if DT specifies stdout-path")
      Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.comSigned-off-by: NHans de Goede <hdegoede@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Cc: Thorsten Leemhuis <regressions@leemhuis.info>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c6c7d83b
  16. 10 11月, 2016 1 次提交
  17. 20 10月, 2016 1 次提交
    • L
      printk: suppress empty continuation lines · 8835ca59
      Linus Torvalds 提交于
      We have a fairly common pattern where you print several things as
      continuations on one single line in a loop, and then at the end you do
      
      	printk(KERN_CONT "\n");
      
      to flush the buffered output.
      
      But if the output was flushed by something else (concurrent printk
      activity, or just system logging), we don't want that final flushing to
      just print an empty line.
      
      So just suppress empty continuation lines when they couldn't be merged
      into the line they are a continuation of.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8835ca59
  18. 10 10月, 2016 4 次提交
    • L
      printk: make reading the kernel log flush pending lines · bfd8d3f2
      Linus Torvalds 提交于
      That will mean that any possible subsequent continuation will now be
      broken up onto a line of its own (since reading the log has finalized
      the beginning og the line), but if user space has activated system
      logging (or if there's a kernel message dump going on) that is the right
      thing to do.
      
      And now that we actually get the continuation flags _right_ for this
      all, the user space logger that is reading the kernel messages can
      actually see the continuation marker.  Not that anybody seems to really
      bother with it (or care), but in theory user space can do its own
      message stitching.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bfd8d3f2
    • L
      printk: re-organize log_output() to be more legible · 5e467652
      Linus Torvalds 提交于
      Avoid some duplicate logic now that we can return early, and update the
      comments for the new LOG_CONT world order.
      
      This also stops the continuation flushing from just using random record
      flags for the flushing action, instead taking the flags from the proper
      original line and updating them as we add continuations to it.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5e467652
    • L
      printk: split out core logging code into helper function · c362c7ff
      Linus Torvalds 提交于
      The code that actually decides how to log the message (whether to put it
      directly into the record log, whether to append it to an existing
      buffered log, or whether to start a new buffered log) is fairly
      non-obvious code in the middle of the vprintk_emit() function.
      
      Splitting that code up into a helper function makes it easier to
      understand, but perhaps more importantly also allows for the code to
      just return early out of the helper function once it has made the
      decision about where the new log content goes.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c362c7ff
    • L
      printk: reinstate KERN_CONT for printing continuation lines · 4bcc595c
      Linus Torvalds 提交于
      Long long ago the kernel log buffer was a buffered stream of bytes, very
      much like stdio in user space.  It supported log levels by scanning the
      stream and noticing the log level markers at the beginning of each line,
      but if you wanted to print a partial line in multiple chunks, you just
      did multiple printk() calls, and it just automatically worked.
      
      Except when it didn't, and you had very confusing output when different
      lines got all mixed up with each other.  Then you got fragment lines
      mixing with each other, or with non-fragment lines, because it was
      traditionally impossible to tell whether a printk() call was a
      continuation or not.
      
      To at least help clarify the issue of continuation lines, we added a
      KERN_CONT marker back in 2007 to mark continuation lines:
      
        47492527 ("printk: add KERN_CONT annotation").
      
      That continuation marker was initially an empty string, and didn't
      actuall make any semantic difference.  But it at least made it possible
      to annotate the source code, and have check-patch notice that a printk()
      didn't need or want a log level marker, because it was a continuation of
      a previous line.
      
      To avoid the ambiguity between a continuation line that had that
      KERN_CONT marker, and a printk with no level information at all, we then
      in 2009 made KERN_CONT be a real log level marker which meant that we
      could now reliably tell the difference between the two cases.
      
        5fd29d6c ("printk: clean up handling of log-levels and newlines")
      
      and we could take advantage of that to make sure we didn't mix up
      continuation lines with lines that just didn't have any loglevel at all.
      
      Then, in 2012, the kernel log buffer was changed to be a "record" based
      log, where each line was a record that has a loglevel and a timestamp.
      
      You can see the beginning of that conversion in commits
      
        e11fea92 ("kmsg: export printk records to the /dev/kmsg interface")
        7ff9554b ("printk: convert byte-buffer to variable-length record buffer")
      
      with a number of follow-up commits to fix some painful fallout from that
      conversion.  Over all, it took a couple of months to sort out most of
      it.  But the upside was that you could have concurrent readers (and
      writers) of the kernel log and not have lines with mixed output in them.
      
      And one particular pain-point for the record-based kernel logging was
      exactly the fragmentary lines that are generated in smaller chunks.  In
      order to still log them as one recrod, the continuation lines need to be
      attached to the previous record properly.
      
      However the explicit continuation record marker that is actually useful
      for this exact case was actually removed in aroundm the same time by commit
      
        61e99ab8 ("printk: remove the now unnecessary "C" annotation for KERN_CONT")
      
      due to the incorrect belief that KERN_CONT wasn't meaningful.  The
      ambiguity between "is this a continuation line" or "is this a plain
      printk with no log level information" was reintroduced, and in fact
      became an even bigger pain point because there was now the whole
      record-level merging of kernel messages going on.
      
      This patch reinstates the KERN_CONT as a real non-empty string marker,
      so that the ambiguity is fixed once again.
      
      But it's not a plain revert of that original removal: in the four years
      since we made KERN_CONT an empty string again, not only has the format
      of the log level markers changed, we've also had some usage changes in
      this area.
      
      For example, some ACPI code seems to use KERN_CONT _together_ with a log
      level, and now uses both the KERN_CONT marker and (for example) a
      KERN_INFO marker to show that it's an informational continuation of a
      line.
      
      Which is actually not a bad idea - if the continuation line cannot be
      attached to its predecessor, without the log level information we don't
      know what log level to assign to it (and we traditionally just assigned
      it the default loglevel).  So having both a log level and the KERN_CONT
      marker is not necessarily a bad idea, but it does mean that we need to
      actually iterate over potentially multiple markers, rather than just a
      single one.
      
      Also, since KERN_CONT was still conceptually needed, and encouraged, but
      didn't actually _do_ anything, we've also had the reverse problem:
      rather than having too many annotations it has too few, and there is bit
      rot with code that no longer marks the continuation lines with the
      KERN_CONT marker.
      
      So this patch not only re-instates the non-empty KERN_CONT marker, it
      also fixes up the cases of bit-rot I noticed in my own logs.
      
      There are probably other cases where KERN_CONT will be needed to be
      added, either because it is new code that never dealt with the need for
      KERN_CONT, or old code that has bitrotted without anybody noticing.
      
      That said, we should strive to avoid the need for KERN_CONT.  It does
      result in real problems for logging, and should generally not be seen as
      a good feature.  If we some day can get rid of the feature entirely,
      because nobody does any fragmented printk calls, that would be lovely.
      
      But until that point, let's at mark the code that relies on the hacky
      multi-fragment kernel printk's.  Not only does it avoid the ambiguity,
      it also annotates code as "maybe this would be good to fix some day".
      
      (That said, particularly during single-threaded bootup, the downsides of
      KERN_CONT are very limited.  Things get much hairier when you have
      multiple threads going on and user level reading and writing logs too).
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4bcc595c
  19. 08 10月, 2016 1 次提交
    • P
      console: don't prefer first registered if DT specifies stdout-path · 05fd007e
      Paul Burton 提交于
      If a device tree specifies a preferred device for kernel console output
      via the stdout-path or linux,stdout-path chosen node properties or the
      stdout alias then the kernel ought to honor it & output the kernel
      console to that device.  As it stands, this isn't the case.  Whilst we
      parse the stdout-path properties & set an of_stdout variable from
      of_alias_scan(), and use that from of_console_check() to determine
      whether to add a console device as a preferred console whilst
      registering it, we also prefer the first registered console if no other
      has been selected at the time of its registration.
      
      This means that if a console other than the one the device tree selects
      via stdout-path is registered first, we will switch to using it & when
      the stdout-path console is later registered the call to
      add_preferred_console() via of_console_check() is too late to do
      anything useful.  In practice this seems to mean that we switch to the
      dummy console device fairly early & see no further console output:
      
          Console: colour dummy device 80x25
          console [tty0] enabled
          bootconsole [ns16550a0] disabled
      
      Fix this by not automatically preferring the first registered console if
      one is specified by the device tree.  This allows consoles to be
      registered but not enabled, and once the driver for the console selected
      by stdout-path calls of_console_check() the driver will be added to the
      list of preferred consoles before any other console has been enabled.
      When that console is then registered via register_console() it will be
      enabled as expected.
      
      Link: http://lkml.kernel.org/r/20160809151937.26118-1-paul.burton@imgtec.comSigned-off-by: NPaul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Ivan Delalande <colona@arista.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jan Kara <jack@suse.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      05fd007e
  20. 10 8月, 2016 1 次提交
    • L
      Revert "printk: create pr_<level> functions" · a0cba217
      Linus Torvalds 提交于
      This reverts commit 874f9c7d.
      
      Geert Uytterhoeven reports:
       "This change seems to have an (unintendent?) side-effect.
      
        Before, pr_*() calls without a trailing newline characters would be
        printed with a newline character appended, both on the console and in
        the output of the dmesg command.
      
        After this commit, no new line character is appended, and the output
        of the next pr_*() call of the same type may be appended, like in:
      
          - Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000
          - Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)
          + Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)"
      
      Joe Perches says:
       "No, that is not intentional.
      
        The newline handling code inside vprintk_emit is a bit involved and
        for now I suggest a revert until this has all the same behavior as
        earlier"
      Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Requested-by: NJoe Perches <joe@perches.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a0cba217
  21. 09 8月, 2016 1 次提交
  22. 03 8月, 2016 4 次提交
    • B
      printk: add kernel parameter to control writes to /dev/kmsg · 750afe7b
      Borislav Petkov 提交于
      Add a "printk.devkmsg" kernel command line parameter which controls how
      userspace writes into /dev/kmsg.  It has three options:
      
       * ratelimit - ratelimit logging from userspace.
       * on  - unlimited logging from userspace
       * off - logging from userspace gets ignored
      
      The default setting is to ratelimit the messages written to it.
      
      This changes the kernel default setting of "on" to "ratelimit" and we do
      that because we want to keep userspace spamming /dev/kmsg to sane
      levels.  This is especially moot when a small kernel log buffer wraps
      around and messages get lost.  So the ratelimiting setting should be a
      sane setting where kernel messages should have a bit higher chance of
      survival from all the spamming.
      
      It additionally does not limit logging to /dev/kmsg while the system is
      booting if we haven't disabled it on the command line.
      
      Furthermore, we can control the logging from a lower priority sysctl
      interface - kernel.printk_devkmsg.
      
      That interface will succeed only if printk.devkmsg *hasn't* been
      supplied on the command line.  If it has, then printk.devkmsg is a
      one-time setting which remains for the duration of the system lifetime.
      This "locking" of the setting is to prevent userspace from changing the
      logging on us through sysctl(2).
      
      This patch is based on previous patches from Linus and Steven.
      
      [bp@suse.de: fixes]
        Link: http://lkml.kernel.org/r/20160719072344.GC25563@nazgul.tnic
      Link: http://lkml.kernel.org/r/20160716061745.15795-3-bp@alien8.deSigned-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Franck Bui <fbui@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      750afe7b
    • C
      printk: include <asm/sections.h> instead of <asm-generic/sections.h> · 40a7d9f5
      Christoph Hellwig 提交于
      asm-generic headers are generic implementations for architecture
      specific code and should not be included by common code.  Thus use the
      asm/ version of sections.h to get at the linker sections.
      
      Link: http://lkml.kernel.org/r/1468285008-7331-1-git-send-email-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      40a7d9f5
    • S
      printk: introduce suppress_message_printing() · cf775444
      Sergey Senozhatsky 提交于
      Messages' levels and console log level are inspected when the actual
      printing occurs, which may provoke console_unlock() and
      console_cont_flush() to waste CPU cycles on every message that has
      loglevel above the current console_loglevel.
      
      Schematically, console_unlock() does the following:
      
      console_unlock()
      {
              ...
              for (;;) {
                      ...
                      raw_spin_lock_irqsave(&logbuf_lock, flags);
      skip:
                      msg = log_from_idx(console_idx);
      
                      if (msg->flags & LOG_NOCONS) {
                              ...
                              goto skip;
                      }
      
                      level = msg->level;
                      len += msg_print_text();                        >> sprintfs
                                                                         memcpy,
                                                                         etc.
      
                      if (nr_ext_console_drivers) {
                              ext_len = msg_print_ext_header();       >> scnprintf
                              ext_len += msg_print_ext_body();        >> scnprintfs
                                                                         etc.
                      }
                      ...
                      raw_spin_unlock(&logbuf_lock);
      
                      call_console_drivers(level, ext_text, ext_len, text, len)
                      {
                              if (level >= console_loglevel &&        >> drop the message
                                              !ignore_loglevel)
                                      return;
      
                              console->write(...);
                      }
      
                      local_irq_restore(flags);
              }
              ...
      }
      
      The thing here is this deferred `level >= console_loglevel' check.  We
      are wasting CPU cycles on sprintfs/memcpy/etc.  preparing the messages
      that we will eventually drop.
      
      This can be huge when we register a new CON_PRINTBUFFER console, for
      instance.  For every such a console register_console() resets the
      
              console_seq, console_idx, console_prev
      
      and sets a `exclusive console' pointer to replay the log buffer to that
      just-registered console.  And there can be a lot of messages to replay,
      in the worst case most of which can be dropped after console_loglevel
      test.
      
      We know messages' levels long before we call msg_print_text() and
      friends, so we can just move console_loglevel check out of
      call_console_drivers() and format a new message only if we are sure that
      it won't be dropped.
      
      The patch factors out loglevel check into suppress_message_printing()
      function and tests message->level and console_loglevel before formatting
      functions in console_unlock() and console_cont_flush() are getting
      executed.  This improves things not only for exclusive CON_PRINTBUFFER
      consoles, but for every console_unlock() that attempts to print a
      message of level above the console_loglevel.
      
      Link: http://lkml.kernel.org/r/20160627135012.8229-1-sergey.senozhatsky@gmail.comSigned-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: NPetr Mladek <pmladek@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Calvin Owens <calvinowens@fb.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cf775444
    • J
      printk: create pr_<level> functions · 874f9c7d
      Joe Perches 提交于
      Using functions instead of macros can reduce overall code size by
      eliminating unnecessary "KERN_SOH<digit>" prefixes from format strings.
      
        defconfig x86-64:
      
        $ size vmlinux*
           text    data     bss      dec     hex  filename
        10193570 4331464 1105920 15630954  ee826a vmlinux.new
        10192623 4335560 1105920 15634103  ee8eb7 vmlinux.old
      
      As the return value are unimportant and unused in the kernel tree, these
      new functions return void.
      
      Miscellanea:
      
       - change pr_<level> macros to call new __pr_<level> functions
       - change vprintk_nmi and vprintk_default to add LOGLEVEL_<level> argument
      
      [akpm@linux-foundation.org: fix LOGLEVEL_INFO, per Joe]
      Link: http://lkml.kernel.org/r/e16cc34479dfefcae37c98b481e6646f0f69efc3.1466718827.git.joe@perches.comSigned-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      874f9c7d