1. 02 3月, 2017 1 次提交
    • P
      sched/fair: Make select_idle_cpu() more aggressive · 4c77b18c
      Peter Zijlstra 提交于
      Kitsunyan reported desktop latency issues on his Celeron 887 because
      of commit:
      
        1b568f0a ("sched/core: Optimize SCHED_SMT")
      
      ... even though his CPU doesn't do SMT.
      
      The effect of running the SMT code on a !SMT part is basically a more
      aggressive select_idle_cpu(). Removing the avg condition fixed things
      for him.
      
      I also know FB likes this test gone, even though other workloads like
      having it.
      
      For now, take it out by default, until we get a better idea.
      Reported-by: Nkitsunyan <kitsunyan@inbox.ru>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4c77b18c
  2. 28 2月, 2017 9 次提交
  3. 25 2月, 2017 7 次提交
  4. 24 2月, 2017 7 次提交
  5. 23 2月, 2017 5 次提交
  6. 22 2月, 2017 2 次提交
    • L
      module: fix memory leak on early load_module() failures · a5544880
      Luis R. Rodriguez 提交于
      While looking for early possible module loading failures I was
      able to reproduce a memory leak possible with kmemleak. There
      are a few rare ways to trigger a failure:
      
        o we've run into a failure while processing kernel parameters
          (parse_args() returns an error)
        o mod_sysfs_setup() fails
        o we're a live patch module and copy_module_elf() fails
      
      Chances of running into this issue is really low.
      
      kmemleak splat:
      
      unreferenced object 0xffff9f2c4ada1b00 (size 32):
        comm "kworker/u16:4", pid 82, jiffies 4294897636 (age 681.816s)
        hex dump (first 32 bytes):
          6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff8c6cfeba>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff8c200046>] __kmalloc_track_caller+0x126/0x230
          [<ffffffff8c1bc581>] kstrdup+0x31/0x60
          [<ffffffff8c1bc5d4>] kstrdup_const+0x24/0x30
          [<ffffffff8c3c23aa>] kvasprintf_const+0x7a/0x90
          [<ffffffff8c3b5481>] kobject_set_name_vargs+0x21/0x90
          [<ffffffff8c4fbdd7>] dev_set_name+0x47/0x50
          [<ffffffffc07819e5>] memstick_check+0x95/0x33c [memstick]
          [<ffffffff8c09c893>] process_one_work+0x1f3/0x4b0
          [<ffffffff8c09cb98>] worker_thread+0x48/0x4e0
          [<ffffffff8c0a2b79>] kthread+0xc9/0xe0
          [<ffffffff8c6dab5f>] ret_from_fork+0x1f/0x40
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Cc: stable <stable@vger.kernel.org> # v2.6.30
      Fixes: e180a6b7 ("param: fix charp parameters set via sysfs")
      Reviewed-by: NMiroslav Benes <mbenes@suse.cz>
      Reviewed-by: NAaron Tomlin <atomlin@redhat.com>
      Reviewed-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
      Signed-off-by: NJessica Yu <jeyu@redhat.com>
      a5544880
    • M
      sched/core: Fix build paravirt build on arm and arm64 · fd4a61e0
      Mark Brown 提交于
      Commit 004172bd ("sched/core: Remove unnecessary #include headers")
      removed the inclusion of asm/paravirt.h which is used to get
      declarations of paravirt_steal_rq_enabled and paravirt_steal_clock.
      
      It is implicitly included on x86 but not on arm and arm64 breaking the
      build if paravirtualization is used.  Since things from that header are
      used directly fix the build by putting the direct inclusion back.
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fd4a61e0
  7. 20 2月, 2017 1 次提交
  8. 19 2月, 2017 1 次提交
    • S
      printk: use rcuidle console tracepoint · fc98c3c8
      Sergey Senozhatsky 提交于
      Use rcuidle console tracepoint because, apparently, it may be issued
      from an idle CPU:
      
        hw-breakpoint: Failed to enable monitor mode on CPU 0.
        hw-breakpoint: CPU 0 failed to disable vector catch
      
        ===============================
        [ ERR: suspicious RCU usage.  ]
        4.10.0-rc8-next-20170215+ #119 Not tainted
        -------------------------------
        ./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
        RCU used illegally from idle CPU!
        rcu_scheduler_active = 2, debug_locks = 0
        RCU used illegally from extended quiescent state!
        2 locks held by swapper/0/0:
         #0:  (cpu_pm_notifier_lock){......}, at: [<c0237e2c>] cpu_pm_exit+0x10/0x54
         #1:  (console_lock){+.+.+.}, at: [<c01ab350>] vprintk_emit+0x264/0x474
      
        stack backtrace:
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119
        Hardware name: Generic OMAP4 (Flattened Device Tree)
          console_unlock
          vprintk_emit
          vprintk_default
          printk
          reset_ctrl_regs
          dbg_cpu_pm_notify
          notifier_call_chain
          cpu_pm_exit
          omap_enter_idle_coupled
          cpuidle_enter_state
          cpuidle_enter_state_coupled
          do_idle
          cpu_startup_entry
          start_kernel
      
      This RCU warning, however, is suppressed by lockdep_off() in printk().
      lockdep_off() increments the ->lockdep_recursion counter and thus
      disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want
      lockdep to be enabled "current->lockdep_recursion == 0".
      
      Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.comSigned-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: NTony Lindgren <tony@atomide.com>
      Tested-by: NTony Lindgren <tony@atomide.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Russell King <rmk@armlinux.org.uk>
      Cc: <stable@vger.kernel.org> [3.4+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fc98c3c8
  9. 18 2月, 2017 5 次提交
    • M
      hrtimer: Catch invalid clockids again · 336a9cde
      Marc Zyngier 提交于
      commit 82e88ff1 ("hrtimer: Revert CLOCK_MONOTONIC_RAW support") removed
      unfortunately a sanity check in the hrtimer code which was part of that
      MONOTONIC_RAW patch series.
      
      It would have caught the bogus usage of CLOCK_MONOTONIC_RAW in the wireless
      code. So bring it back.
      
      It is way too easy to take any random clockid and feed it to the hrtimer
      subsystem. At best, it gets mapped to a monotonic base, but it would be
      better to just catch illegal values as early as possible.
          
      Detect invalid clockids, map them to CLOCK_MONOTONIC and emit a warning.
      
      [ tglx: Replaced the BUG by a WARN and gracefully map to CLOCK_MONOTONIC ]
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Cc: Tomasz Nowicki <tn@semihalf.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Link: http://lkml.kernel.org/r/1452879670-16133-3-git-send-email-marc.zyngier@arm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      336a9cde
    • G
      PM / sleep: Fix test_suspend after sleep state rework · 8cff6679
      Geert Uytterhoeven 提交于
      When passing "test_suspend=mem" to the kernel:
      
          PM: can't test 'mem' suspend state
      
      and the suspend test is not run.
      
      Commit 406e7938 ("PM / sleep: System sleep state selection
      interface rework") changed pm_labels[] from a contiguous NULL-terminated
      array to a sparse array (with the first element unpopulated), breaking
      the assumptions of the iterator in setup_test_suspend().
      
      Iterate from PM_SUSPEND_MIN to PM_SUSPEND_MAX - 1 to fix this.
      
      Fixes: 406e7938 (PM / sleep: System sleep state selection interface rework)
      Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      8cff6679
    • D
      bpf: make jited programs visible in traces · 74451e66
      Daniel Borkmann 提交于
      Long standing issue with JITed programs is that stack traces from
      function tracing check whether a given address is kernel code
      through {__,}kernel_text_address(), which checks for code in core
      kernel, modules and dynamically allocated ftrace trampolines. But
      what is still missing is BPF JITed programs (interpreted programs
      are not an issue as __bpf_prog_run() will be attributed to them),
      thus when a stack trace is triggered, the code walking the stack
      won't see any of the JITed ones. The same for address correlation
      done from user space via reading /proc/kallsyms. This is read by
      tools like perf, but the latter is also useful for permanent live
      tracing with eBPF itself in combination with stack maps when other
      eBPF types are part of the callchain. See offwaketime example on
      dumping stack from a map.
      
      This work tries to tackle that issue by making the addresses and
      symbols known to the kernel. The lookup from *kernel_text_address()
      is implemented through a latched RB tree that can be read under
      RCU in fast-path that is also shared for symbol/size/offset lookup
      for a specific given address in kallsyms. The slow-path iteration
      through all symbols in the seq file done via RCU list, which holds
      a tiny fraction of all exported ksyms, usually below 0.1 percent.
      Function symbols are exported as bpf_prog_<tag>, in order to aide
      debugging and attribution. This facility is currently enabled for
      root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
      is active in any mode. The rationale behind this is that still a lot
      of systems ship with world read permissions on kallsyms thus addresses
      should not get suddenly exposed for them. If that situation gets
      much better in future, we always have the option to change the
      default on this. Likewise, unprivileged programs are not allowed
      to add entries there either, but that is less of a concern as most
      such programs types relevant in this context are for root-only anyway.
      If enabled, call graphs and stack traces will then show a correct
      attribution; one example is illustrated below, where the trace is
      now visible in tooling such as perf script --kallsyms=/proc/kallsyms
      and friends.
      
      Before:
      
        7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)
      
      After:
      
        7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
        [...]
        7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      74451e66
    • D
      bpf: remove stubs for cBPF from arch code · 9383191d
      Daniel Borkmann 提交于
      Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make
      that a single __weak function in the core that can be overridden
      similarly to the eBPF one. Also remove stale pr_err() mentions
      of bpf_jit_compile.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9383191d
    • D
      bpf: mark all registered map/prog types as __ro_after_init · c78f8bdf
      Daniel Borkmann 提交于
      All map types and prog types are registered to the BPF core through
      bpf_register_map_type() and bpf_register_prog_type() during init and
      remain unchanged thereafter. As by design we don't (and never will)
      have any pluggable code that can register to that at any later point
      in time, lets mark all the existing bpf_{map,prog}_type_list objects
      in the tree as __ro_after_init, so they can be moved to read-only
      section from then onwards.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c78f8bdf
  10. 17 2月, 2017 2 次提交