• D
    kernel/hung_task.c: allow to set checking interval separately from timeout · a2e51445
    Dmitry Vyukov 提交于
    Currently task hung checking interval is equal to timeout, as the result
    hung is detected anywhere between timeout and 2*timeout.  This is fine for
    most interactive environments, but this hurts automated testing setups
    (syzbot).  In an automated setup we need to strictly order CPU lockup <
    RCU stall < workqueue lockup < task hung < silent loss, so that RCU stall
    is not detected as task hung and task hung is not detected as silent
    machine loss.  The large variance in task hung detection timeout requires
    setting silent machine loss timeout to a very large value (e.g.  if task
    hung is 3 mins, then silent loss need to be set to ~7 mins).  The
    additional 3 minutes significantly reduce testing efficiency because
    usually we crash kernel within a minute, and this can add hours to bug
    localization process as it needs to do dozens of tests.
    
    Allow setting checking interval separately from timeout.  This allows to
    set timeout to, say, 3 minutes, but checking interval to 10 secs.
    
    The interval is controlled via a new hung_task_check_interval_secs sysctl,
    similar to the existing hung_task_timeout_secs sysctl.  The default value
    of 0 results in the current behavior: checking interval is equal to
    timeout.
    
    [akpm@linux-foundation.org: update hung_task_timeout_max's comment]
    Link: http://lkml.kernel.org/r/20180611111004.203513-1-dvyukov@google.comSigned-off-by: NDmitry Vyukov <dvyukov@google.com>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    a2e51445
sysctl.h 2.5 KB