1. 21 10月, 2017 1 次提交
  2. 04 10月, 2017 14 次提交
  3. 03 10月, 2017 2 次提交
    • P
      rcu: Remove extraneous READ_ONCE()s from rcu_irq_{enter,exit}() · f39b536c
      Paul E. McKenney 提交于
      The read of ->dynticks_nmi_nesting in rcu_irq_enter() and rcu_irq_exit()
      is currently protected with READ_ONCE().  However, this protection is
      unnecessary because (1) ->dynticks_nmi_nesting is updated only by the
      current CPU, (2) Although NMI handlers can update this field, they reset
      it back to its old value before return, and (3) Interrupts are disabled,
      so nothing else can modify it.  The value of ->dynticks_nmi_nesting is
      thus effectively constant, and so no protection is required.
      
      This commit therefore removes the READ_ONCE() protection from these
      two accesses.
      
      Link: http://lkml.kernel.org/r/20170926031902.GA2074@linux.vnet.ibm.comReported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      f39b536c
    • S
      ftrace: Fix kmemleak in unregister_ftrace_graph · 2b0b8499
      Shu Wang 提交于
      The trampoline allocated by function tracer was overwriten by function_graph
      tracer, and caused a memory leak. The save_global_trampoline should have
      saved the previous trampoline in register_ftrace_graph() and restored it in
      unregister_ftrace_graph(). But as it is implemented, save_global_trampoline was
      only used in unregister_ftrace_graph as default value 0, and it overwrote the
      previous trampoline's value. Causing the previous allocated trampoline to be
      lost.
      
      kmmeleak backtrace:
          kmemleak_vmalloc+0x77/0xc0
          __vmalloc_node_range+0x1b5/0x2c0
          module_alloc+0x7c/0xd0
          arch_ftrace_update_trampoline+0xb5/0x290
          ftrace_startup+0x78/0x210
          register_ftrace_function+0x8b/0xd0
          function_trace_init+0x4f/0x80
          tracing_set_tracer+0xe6/0x170
          tracing_set_trace_write+0x90/0xd0
          __vfs_write+0x37/0x170
          vfs_write+0xb2/0x1b0
          SyS_write+0x55/0xc0
          do_syscall_64+0x67/0x180
          return_from_SYSCALL_64+0x0/0x6a
      
      [
        Looking further into this, I found that this was left over from when the
        function and function graph tracers shared the same ftrace_ops. But in
        commit 5f151b24 ("ftrace: Fix function_profiler and function tracer
        together"), the two were separated, and the save_global_trampoline no
        longer was necessary (and it may have been broken back then too).
        -- Steven Rostedt
      ]
      
      Link: http://lkml.kernel.org/r/20170912021454.5976-1-shuwang@redhat.com
      
      Cc: stable@vger.kernel.org
      Fixes: 5f151b24 ("ftrace: Fix function_profiler and function tracer together")
      Signed-off-by: NShu Wang <shuwang@redhat.com>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      2b0b8499
  4. 30 9月, 2017 1 次提交
    • A
      fix infoleak in waitid(2) · 6c85501f
      Al Viro 提交于
      kernel_waitid() can return a PID, an error or 0.  rusage is filled in the first
      case and waitid(2) rusage should've been copied out exactly in that case, *not*
      whenever kernel_waitid() has not returned an error.  Compat variant shares that
      braino; none of kernel_wait4() callers do, so the below ought to fix it.
      Reported-and-tested-by: NAlexander Potapenko <glider@google.com>
      Fixes: ce72a16f ("wait4(2)/waitid(2): separate copying rusage to userland")
      Cc: stable@vger.kernel.org # v4.13
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6c85501f
  5. 29 9月, 2017 7 次提交
    • E
      sched/sysctl: Check user input value of sysctl_sched_time_avg · 5ccba44b
      Ethan Zhao 提交于
      System will hang if user set sysctl_sched_time_avg to 0:
      
        [root@XXX ~]# sysctl kernel.sched_time_avg_ms=0
      
        Stack traceback for pid 0
        0xffff883f6406c600 0 0 1 3 R 0xffff883f6406cf50 *swapper/3
        ffff883f7ccc3ae8 0000000000000018 ffffffff810c4dd0 0000000000000000
        0000000000017800 ffff883f7ccc3d78 0000000000000003 ffff883f7ccc3bf8
        ffffffff810c4fc9 ffff883f7ccc3c08 00000000810c5043 ffff883f7ccc3c08
        Call Trace:
        <IRQ> [<ffffffff810c4dd0>] ? update_group_capacity+0x110/0x200
        [<ffffffff810c4fc9>] ? update_sd_lb_stats+0x109/0x600
        [<ffffffff810c5507>] ? find_busiest_group+0x47/0x530
        [<ffffffff810c5b84>] ? load_balance+0x194/0x900
        [<ffffffff810ad5ca>] ? update_rq_clock.part.83+0x1a/0xe0
        [<ffffffff810c6d42>] ? rebalance_domains+0x152/0x290
        [<ffffffff810c6f5c>] ? run_rebalance_domains+0xdc/0x1d0
        [<ffffffff8108a75b>] ? __do_softirq+0xfb/0x320
        [<ffffffff8108ac85>] ? irq_exit+0x125/0x130
        [<ffffffff810b3a17>] ? scheduler_ipi+0x97/0x160
        [<ffffffff81052709>] ? smp_reschedule_interrupt+0x29/0x30
        [<ffffffff8173a1be>] ? reschedule_interrupt+0x6e/0x80
         <EOI> [<ffffffff815bc83c>] ? cpuidle_enter_state+0xcc/0x230
        [<ffffffff815bc80c>] ? cpuidle_enter_state+0x9c/0x230
        [<ffffffff815bc9d7>] ? cpuidle_enter+0x17/0x20
        [<ffffffff810cd6dc>] ? cpu_startup_entry+0x38c/0x420
        [<ffffffff81053373>] ? start_secondary+0x173/0x1e0
      
      Because divide-by-zero error happens in function:
      
      update_group_capacity()
        update_cpu_capacity()
          scale_rt_capacity()
           {
                ...
                total = sched_avg_period() + delta;
                used = div_u64(avg, total);
                ...
           }
      
      To fix this issue, check user input value of sysctl_sched_time_avg, keep
      it unchanged when hitting invalid input, and set the minimum limit of
      sysctl_sched_time_avg to 1 ms.
      Reported-by: NJames Puthukattukaran <james.puthukattukaran@oracle.com>
      Signed-off-by: NEthan Zhao <ethan.zhao@oracle.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: efault@gmx.de
      Cc: ethan.kernel@gmail.com
      Cc: keescook@chromium.org
      Cc: mcgrof@kernel.org
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1504504774-18253-1-git-send-email-ethan.zhao@oracle.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5ccba44b
    • P
      sched/debug: Ignore TASK_IDLE for SysRq-W · 5d68cc95
      Peter Zijlstra 提交于
      Markus reported that tasks in TASK_IDLE state are reported by SysRq-W,
      which results in undesirable clutter.
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5d68cc95
    • P
      sched/tracing: Use common task-state helpers · 5f6ad26e
      Peter Zijlstra 提交于
      Remove yet another task-state char instance.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5f6ad26e
    • P
      locking/rwsem-xadd: Fix missed wakeup due to reordering of load · 9c29c318
      Prateek Sood 提交于
      If a spinner is present, there is a chance that the load of
      rwsem_has_spinner() in rwsem_wake() can be reordered with
      respect to decrement of rwsem count in __up_write() leading
      to wakeup being missed:
      
       spinning writer                  up_write caller
       ---------------                  -----------------------
       [S] osq_unlock()                 [L] osq
        spin_lock(wait_lock)
        sem->count=0xFFFFFFFF00000001
                  +0xFFFFFFFF00000000
        count=sem->count
        MB
                                         sem->count=0xFFFFFFFE00000001
                                                   -0xFFFFFFFF00000001
                                         spin_trylock(wait_lock)
                                         return
       rwsem_try_write_lock(count)
       spin_unlock(wait_lock)
       schedule()
      
      Reordering of atomic_long_sub_return_release() in __up_write()
      and rwsem_has_spinner() in rwsem_wake() can cause missing of
      wakeup in up_write() context. In spinning writer, sem->count
      and local variable count is 0XFFFFFFFE00000001. It would result
      in rwsem_try_write_lock() failing to acquire rwsem and spinning
      writer going to sleep in rwsem_down_write_failed().
      
      The smp_rmb() will make sure that the spinner state is
      consulted after sem->count is updated in up_write context.
      Signed-off-by: NPrateek Sood <prsood@codeaurora.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave@stgolabs.net
      Cc: longman@redhat.com
      Cc: parri.andrea@gmail.com
      Cc: sramana@codeaurora.org
      Link: http://lkml.kernel.org/r/1504794658-15397-1-git-send-email-prsood@codeaurora.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9c29c318
    • P
      sched/debug: Remove unused variable · 65d5dc47
      Peter Zijlstra 提交于
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      65d5dc47
    • A
      perf/aux: Only update ->aux_wakeup in non-overwrite mode · 441430eb
      Alexander Shishkin 提交于
      The following commit:
      
        d9a50b02 ("perf/aux: Ensure aux_wakeup represents most recent wakeup index")
      
      changed the AUX wakeup position calculation to rounddown(), which causes
      a division-by-zero in AUX overwrite mode (aka "snapshot mode").
      
      The zero denominator results from the fact that perf record doesn't set
      aux_watermark to anything, in which case the kernel will set it to half
      the AUX buffer size, but only for non-overwrite mode. In the overwrite
      mode aux_watermark stays zero.
      
      The good news is that, AUX overwrite mode, wakeups don't happen and
      related bookkeeping is not relevant, so we can simply forego the whole
      wakeup updates.
      Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/20170906160811.16510-1-alexander.shishkin@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      441430eb
    • R
      PM / s2idle: Invoke the ->wake() platform callback earlier · 87cbde8d
      Rafael J. Wysocki 提交于
      The role of the ->wake() platform callback for suspend-to-idle is to
      deal with possible spurious wakeups, among other things.  The ACPI
      implementation of it, acpi_s2idle_wake(), additionally checks the
      conditions for entering the Low Power S0 Idle state by the platform
      and reports the ones that have not been met.
      
      However, the ->wake() platform callback is invoked after calling
      dpm_noirq_resume_devices(), which means that the power states of some
      devices may have changed since s2idle_enter() returned, so some unmet
      Low Power S0 Idle conditions may be reported incorrectly as a result
      of that.
      
      To avoid these false positives, reorder the invocations of the
      dpm_noirq_resume_devices() routine and the ->wake() platform callback
      in s2idle_loop().
      
      Fixes: 726fb6b4 (ACPI / PM: Check low power idle constraints for debug only)
      Tested-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      87cbde8d
  6. 28 9月, 2017 3 次提交
  7. 26 9月, 2017 8 次提交
  8. 25 9月, 2017 3 次提交
  9. 24 9月, 2017 1 次提交