  1. 21 Feb, 2018 1 commit
  2. 07 Nov, 2017 1 commit
    • kprobes, x86/alternatives: Use text_mutex to protect smp_alt_modules · e846d139
      Zhou Chengming committed
      We use alternatives_text_reserved() to check whether the address is in
      the fixed pieces reserved for alternatives, but the problem is that
      we don't hold the smp_alt mutex when calling this function. So the list
      traversal may encounter a deleted list_head if another path is doing
      alternatives_smp_module_del().
      
      One solution is to hold the smp_alt mutex before calling this
      function, but the difficulty is that the callers of this
      function, arch_prepare_kprobe() and arch_prepare_optimized_kprobe(),
      are called with text_mutex held. So we must take the smp_alt mutex
      before we go into this arch-dependent code. But we can't do that now:
      the smp_alt mutex is arch-dependent, and only x86 has it.
      Maybe we could export another arch-dependent callback to solve this.
      
      But there is a simpler way to handle this problem. We can reuse the
      text_mutex to protect smp_alt_modules instead of using another mutex.
      And all the arch dependent checks of kprobes are inside the text_mutex,
      so it's safe now.
      Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@suse.de
      Fixes: 2cfa1978 "ftrace/alternatives: Introducing *_text_reserved functions"
      Link: http://lkml.kernel.org/r/1509585501-79466-1-git-send-email-zhouchengming1@huawei.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
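The locking idea in the commit above can be modeled in user space. The following is a minimal sketch, not the kernel code: `struct alt_module`, `alt_modules`, `text_reserved_locked()` and `probe_would_conflict()` are hypothetical stand-ins, a pthread mutex plays the role of `text_mutex`, and the reserved-range test is simplified to a plain interval overlap. It only illustrates that list updates and the reservation check are serialized by the same lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry on the smp_alt_modules list. */
struct alt_module {
    unsigned long text_start;
    unsigned long text_end;
    struct alt_module *next;
};

/* A single lock covers both text patching and the module list. */
static pthread_mutex_t text_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct alt_module *alt_modules;

/* Caller must already hold text_mutex (as the kprobes arch hooks do). */
static bool text_reserved_locked(unsigned long start, unsigned long end)
{
    for (struct alt_module *m = alt_modules; m; m = m->next)
        if (start < m->text_end && end > m->text_start)
            return true;
    return false;
}

static void alt_module_add(struct alt_module *m)
{
    pthread_mutex_lock(&text_mutex);
    m->next = alt_modules;
    alt_modules = m;
    pthread_mutex_unlock(&text_mutex);
}

static void alt_module_del(struct alt_module *m)
{
    pthread_mutex_lock(&text_mutex);
    for (struct alt_module **pp = &alt_modules; *pp; pp = &(*pp)->next) {
        if (*pp == m) {
            *pp = m->next;  /* unlink while no reader can be traversing */
            break;
        }
    }
    pthread_mutex_unlock(&text_mutex);
}

/* Model of a probe-registration path: take the lock, then run the check. */
static bool probe_would_conflict(unsigned long addr, unsigned long len)
{
    pthread_mutex_lock(&text_mutex);
    bool reserved = text_reserved_locked(addr, addr + len);
    pthread_mutex_unlock(&text_mutex);
    return reserved;
}
```

Because deletion and traversal take the same mutex, a reader can never observe a half-unlinked entry, which is the property the commit message relies on.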
  3. 24 Sep, 2017 2 commits
    • extable: Enable RCU if it is not watching in kernel_text_address() · e8cac8b1
      Steven Rostedt (VMware) committed
      If kernel_text_address() is called when RCU is not watching, it can cause an
      RCU bug because is_module_text_address(), the is_kprobe_*insn_slot()
      and is_bpf_text_address() functions require the use of RCU.
      
      Only enable RCU if it is not currently watching before it calls
      is_module_text_address(). rcu_nmi_enter() is used to enable RCU
      because kernel_text_address() can happen pretty much anywhere (like an NMI),
      and even from within an NMI. It is called via save_stack_trace() that can be
      called by any WARN() or tracing function, which can happen while RCU is not
      watching (for example, going to or coming from idle, or during CPU take down
      or bring up).
      
      Cc: stable@vger.kernel.org
      Fixes: 0be964be ("module: Sanitize RCU usage and locking")
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
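The "enable RCU only if it is not already watching" pattern described in the commit above can be sketched as follows. This is a user-space model under stated assumptions: the `model_*` functions and the `rcu_watching` flag are hypothetical stand-ins for the kernel's per-CPU RCU state, and the lookup is a dummy range test.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for RCU state; the kernel tracks this per CPU. */
static bool rcu_watching;
static int enter_calls, exit_calls;

static bool model_rcu_is_watching(void) { return rcu_watching; }
static void model_rcu_nmi_enter(void) { rcu_watching = true; enter_calls++; }
static void model_rcu_nmi_exit(void) { rcu_watching = false; exit_calls++; }

/* Stand-in for a lookup that is only legal while RCU is watching. */
static int model_rcu_protected_lookup(unsigned long addr)
{
    if (!rcu_watching)
        return -1;  /* models the RCU misuse the commit describes */
    return addr >= 0x1000 && addr < 0x2000;
}

static int model_kernel_text_address(unsigned long addr)
{
    /* Remember whether we had to turn RCU on, so we undo only that. */
    bool no_rcu = !model_rcu_is_watching();
    int ret;

    if (no_rcu)
        model_rcu_nmi_enter();
    ret = model_rcu_protected_lookup(addr);
    if (no_rcu)
        model_rcu_nmi_exit();
    return ret;
}
```

Saving the initial state in a local variable keeps the enter and exit calls balanced and leaves RCU untouched when the caller was already in a watched context.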
    • extable: Consolidate *kernel_text_address() functions · 9aadde91
      Steven Rostedt (VMware) committed
      The functionality between kernel_text_address() and _kernel_text_address()
      is the same except that _kernel_text_address() does a little more (that
      function needs a rename, but that can be done another time). Instead of
      having duplicate code in both, simply have _kernel_text_address() call
      kernel_text_address() instead.
      
      This is marked for stable because there's an RCU bug that can happen if
      one of these functions gets called while RCU is not watching. That fix
      depends on this fix to keep from having to write the fix twice.
      
      Cc: stable@vger.kernel.org
      Fixes: 0be964be ("module: Sanitize RCU usage and locking")
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  4. 11 Jul, 2017 1 commit
  5. 07 Jul, 2017 1 commit
  6. 23 May, 2017 1 commit
  7. 18 Feb, 2017 1 commit
    • bpf: make jited programs visible in traces · 74451e66
      Daniel Borkmann committed
      Long standing issue with JITed programs is that stack traces from
      function tracing check whether a given address is kernel code
      through {__,}kernel_text_address(), which checks for code in core
      kernel, modules and dynamically allocated ftrace trampolines. But
      what is still missing is BPF JITed programs (interpreted programs
      are not an issue as __bpf_prog_run() will be attributed to them),
      thus when a stack trace is triggered, the code walking the stack
      won't see any of the JITed ones. The same for address correlation
      done from user space via reading /proc/kallsyms. This is read by
      tools like perf, but the latter is also useful for permanent live
      tracing with eBPF itself in combination with stack maps when other
      eBPF types are part of the callchain. See offwaketime example on
      dumping stack from a map.
      
      This work tries to tackle that issue by making the addresses and
      symbols known to the kernel. The lookup from *kernel_text_address()
      is implemented through a latched RB tree that can be read under
      RCU in fast-path that is also shared for symbol/size/offset lookup
      for a specific given address in kallsyms. The slow-path iteration
      through all symbols in the seq file is done via an RCU list, which holds
      a tiny fraction of all exported ksyms, usually below 0.1 percent.
      Function symbols are exported as bpf_prog_<tag>, in order to aid
      debugging and attribution. This facility is currently enabled for
      root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
      is active in any mode. The rationale behind this is that still a lot
      of systems ship with world read permissions on kallsyms thus addresses
      should not get suddenly exposed for them. If that situation gets
      much better in future, we always have the option to change the
      default on this. Likewise, unprivileged programs are not allowed
      to add entries there either, but that is less of a concern as most
      such programs types relevant in this context are for root-only anyway.
      If enabled, call graphs and stack traces will then show a correct
      attribution; one example is illustrated below, where the trace is
      now visible in tooling such as perf script --kallsyms=/proc/kallsyms
      and friends.
      
      Before:
      
        7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)
      
      After:
      
        7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
        [...]
        7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 10 Feb, 2017 1 commit
    • core: migrate exception table users off module.h and onto extable.h · 8a293be0
      Paul Gortmaker committed
      These files were including module.h for exception table related
      functions.  We've now separated that content out into its own file
      "extable.h" so now move over to that and where possible, avoid all
      the extra header content in module.h that we don't really need to
      compile these non-modular files.
      
      Note:
         init/main.c still needs module.h for __init_or_module
         kernel/extable.c still needs module.h for is_module_text_address
      
      ...and so we don't get the benefit of removing module.h from the cpp
      feed for these two files, unlike the almost universal 1:1 exchange
      of module.h for extable.h we were able to do in the arch dirs.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Jessica Yu <jeyu@redhat.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  9. 14 Jan, 2017 1 commit
    • kprobes, extable: Identify kprobes trampolines as kernel text area · 5b485629
      Masami Hiramatsu committed
      Improve __kernel_text_address()/kernel_text_address() to return
      true if the given address is on a kprobe's instruction slot
      trampoline.
      
      This can help stacktraces determine whether the address is in a
      text area or not.
      
      To implement this atomically in is_kprobe_*_slot(), also change
      the insn_cache page list to an RCU list.
      
      This changes timings a bit (it delays page freeing to the RCU garbage
      collection phase), but none of that is in the hot path.
      
      Note: this change can add small overhead to stack unwinders because
      it adds 2 additional checks to __kernel_text_address(). However, the
      impact should be very small, because kprobe_insn_pages list has 1 entry
      per 256 probes (on x86; on arm/arm64 it will be 1024 probes),
      and kprobe_optinsn_pages has 1 entry per 32 probes (on x86).
      In most use cases, the number of kprobe events may be less
      than 20, which means that is_kprobe_*_slot() will check just one entry.
      Tested-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/148388747896.6869.6354262871751682264.stgit@devbox
      [ Improved the changelog and coding style. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 25 Dec, 2016 1 commit
  11. 11 Sep, 2015 1 commit
  12. 20 Nov, 2014 1 commit
    • ftrace/x86/extable: Add is_ftrace_trampoline() function · aec0be2d
      Steven Rostedt (Red Hat) committed
      Stack traces that happen from function tracing check if the address
      on the stack is a __kernel_text_address(). That is, is the address
      kernel code. This calls core_kernel_text() which returns true
      if the address is part of the builtin kernel code. It also calls
      is_module_text_address() which returns true if the address belongs
      to module code.
      
      But what is missing is ftrace dynamically allocated trampolines.
      These trampolines are allocated for individual ftrace_ops that
      call the ftrace_ops callback functions directly. But if they do a
      stack trace, the code checking the stack won't detect them as they
      are neither core kernel code nor module address space.
      
      By adding another field to ftrace_ops that also stores the size of
      the trampoline assigned to it, we can create a new function called
      is_ftrace_trampoline() that returns true if the address is a
      dynamically allocated ftrace trampoline. Note, it ignores trampolines
      that are not dynamically allocated as they will return true with
      the core_kernel_text() function.
      
      Link: http://lkml.kernel.org/r/20141119034829.497125839@goodmis.org
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
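The address-plus-size check described in the commit above can be illustrated with a small model. This is a sketch under assumptions, not the kernel implementation: `struct model_ops`, its field names, and `model_ops_list` are hypothetical and keep only the pieces the check needs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down ops record: only what the check needs. */
struct model_ops {
    unsigned long trampoline;       /* start address, 0 if none */
    unsigned long trampoline_size;  /* length in bytes */
    bool dynamically_allocated;     /* static trampolines are core text */
    struct model_ops *next;
};

static struct model_ops *model_ops_list;

/* True if addr lies inside any dynamically allocated trampoline. */
static bool model_is_ftrace_trampoline(unsigned long addr)
{
    for (struct model_ops *op = model_ops_list; op; op = op->next) {
        if (!op->dynamically_allocated || !op->trampoline)
            continue;
        if (addr >= op->trampoline &&
            addr < op->trampoline + op->trampoline_size)
            return true;
    }
    return false;
}
```

Statically allocated trampolines are skipped on purpose: as the commit notes, those addresses are already covered by the core kernel text check.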
  13. 14 Feb, 2014 1 commit
  14. 29 Nov, 2013 1 commit
  15. 12 Sep, 2013 1 commit
  16. 15 Apr, 2013 1 commit
  17. 20 Apr, 2012 1 commit
  18. 20 May, 2011 2 commits
    • extable, core_kernel_data(): Make sure all archs define _sdata · a2d063ac
      Steven Rostedt committed
      A new utility function (core_kernel_data()) is used to determine if a
      passed in address is part of core kernel data or not. It may or may not
      return true for RO data, but this utility must work for RW data.
      
      Thus both _sdata and _edata must be defined and continuous,
      without .init sections that may later be freed and replaced by
      volatile memory (memory that can be freed).
      
      This utility function is used to determine if data is safe from
      ever being freed. Thus it should return true for all RW global
      data that is not in a module or has been allocated, or false
      otherwise.
      
      Also change core_kernel_data() back to the more precise _sdata condition
      and document the function.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Helge Deller <deller@gmx.de>
      Cc: JamesE.J.Bottomley <jejb@parisc-linux.org>
      Link: http://lkml.kernel.org/r/1305855298.1465.19.camel@gandalf.stny.rr.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ----
       arch/alpha/kernel/vmlinux.lds.S   |    1 +
       arch/m32r/kernel/vmlinux.lds.S    |    1 +
       arch/m68k/kernel/vmlinux-std.lds  |    2 ++
       arch/m68k/kernel/vmlinux-sun3.lds |    1 +
       arch/mips/kernel/vmlinux.lds.S    |    1 +
       arch/parisc/kernel/vmlinux.lds.S  |    3 +++
       kernel/extable.c                  |   12 +++++++++++-
       7 files changed, 20 insertions(+), 1 deletion(-)
    • core_kernel_data(): Fix architectures that do not define _sdata · c5fc4721
      Ingo Molnar committed
      Some architectures such as Alpha do not define _sdata but _data:
      
        kernel/built-in.o: In function `core_kernel_data':
        kernel/extable.c:77: undefined reference to `_sdata'
      
      So expand the scope of the data range to the text addresses too,
      this might be more correct anyway because this way we can
      cover readonly variables as well.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-i878c8a0e0g0ep4v7i6vxnhz@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  19. 19 May, 2011 1 commit
    • ftrace: Allow dynamically allocated function tracers · cdbe61bf
      Steven Rostedt committed
      Now that functions may be selected individually, it only makes sense
      that we should allow dynamically allocated trace structures to
      be traced. This will allow perf to allocate a ftrace_ops structure
      at runtime and use it to pick and choose which functions that
      structure will trace.
      
      Note, a dynamically allocated ftrace_ops will always be called
      indirectly instead of being called directly from the mcount in
      entry.S. This is because there's no safe way to prevent mcount
      from being preempted before calling the function, unless we
      modify every entry.S to do so (not likely). Thus, dynamically allocated
      functions will now be called by the ftrace_ops_list_func() that
      loops through the ops that are allocated if there are more than
      one op allocated at a time. This loop is protected with a
      preempt_disable.
      
      To determine if an ftrace_ops structure is allocated or not, a new
      util function was added to the kernel/extable.c called
      core_kernel_data(), which returns 1 if the address is between
      _sdata and _edata.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  20. 31 Mar, 2009 1 commit
    • module: remove module_text_address() · a6e6abd5
      Rusty Russell committed
      Impact: Replace and remove risky (non-EXPORTed) API
      
      module_text_address() returns a pointer to the module, which given locking
      improvements in module.c, is useless except to test for NULL:
      
      1) If the module can't go away, use __module_text_address.
      2) Otherwise, just use is_module_text_address().
      
      Cc: linux-mtd@lists.infradead.org
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  21. 23 Mar, 2009 1 commit
  22. 20 Mar, 2009 1 commit
    • tracing, Text Edit Lock - kprobes architecture independent support, nommu fix · 505f2b97
      Ingo Molnar committed
      Impact: build fix on SH !CONFIG_MMU
      
      Stephen Rothwell reported this linux-next build failure on the SH
      architecture:
      
        kernel/built-in.o: In function `disable_all_kprobes':
        kernel/kprobes.c:1382: undefined reference to `text_mutex'
        [...]
      
      And observed:
      
      | Introduced by commit 4460fdad ("tracing,
      | Text Edit Lock - kprobes architecture independent support") from the
      | tracing tree.  text_mutex is defined in mm/memory.c which is only built
      | if CONFIG_MMU is defined, which is not true for sh allmodconfig.
      
      Move this lock to kernel/extable.c (which is already home to various
      kernel text related routines), a file that is always built in.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      LKML-Reference: <20090320110602.86351a91.sfr@canb.auug.org.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  23. 19 Mar, 2009 1 commit
    • symbols, stacktrace: look up init symbols after module symbols · 4a44bac1
      Ingo Molnar committed
      Impact: fix incomplete stacktraces
      
      I noticed such weird stacktrace entries in lockdep dumps:
      
      [    0.285956] {HARDIRQ-ON-W} state was registered at:
      [    0.285956]   [<ffffffff802bce90>] mark_irqflags+0xbe/0x125
      [    0.285956]   [<ffffffff802bf2fd>] __lock_acquire+0x674/0x82d
      [    0.285956]   [<ffffffff802bf5b2>] lock_acquire+0xfc/0x128
      [    0.285956]   [<ffffffff8135b636>] rt_spin_lock+0xc8/0xd0
      [    0.285956]   [<ffffffffffffffff>] 0xffffffffffffffff
      
      The stacktrace entry is cut off after rt_spin_lock.
      
      After much debugging I found out that stacktrace entries that
      belong to init symbols don't get printed out, due to commit:
      
        a2da4052: module: Don't report discarded init pages as kernel text.
      
      The reason is this check added to core_kernel_text():
      
      -       if (addr >= (unsigned long)_sinittext &&
      +       if (system_state == SYSTEM_BOOTING &&
      +           addr >= (unsigned long)_sinittext &&
                  addr <= (unsigned long)_einittext)
                      return 1;
      
      This will discard inittext symbols even though their symbol table
      is still present and even though stacktraces done while the system
      was booting up might still be relevant.
      
      To not reintroduce the (not well-specified) bug addressed in that
      commit, first do a module symbols lookup, then a final init-symbols
      lookup.
      
      This will work fine on architectures that have separate address
      spaces for modules (such as x86) - and should not crash any other
      architectures either.
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <new-discussion>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  24. 09 Feb, 2009 1 commit
  25. 08 Dec, 2008 1 commit
    • tracing/function-graph-tracer: introduce __notrace_funcgraph to filter special functions · 8b96f011
      Frederic Weisbecker committed
      Impact: trace more functions
      
      When the function graph tracer is configured, three more files are
      excluded from tracing just to keep four functions from being traced.
      This impacts the normal function tracer too.
      
      arch/x86/kernel/process_64/32.c:
      
      I had crashes when I let this file be traced. After some debugging, I saw
      that the "current" task pointer was changed inside __switch_to(), i.e.
      "write_pda(pcurrent, next_p);" inside process_64.c. Since the tracer stores
      the original return address of the function inside current, we had
      crashes. Only __switch_to() has to be excluded from tracing.
      
      kernel/module.c and kernel/extable.c:
      
      Because of a function used internally by the function graph tracer:
      __kernel_text_address()
      
      To let the other functions inside these files be traced, this patch
      introduces the __notrace_funcgraph function prefix, which is __notrace if
      the function graph tracer is configured and nothing if not.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
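A config-dependent function prefix of the kind described above can be sketched with the preprocessor. The snippet below is an illustration under assumptions: `CONFIG_FUNCTION_GRAPH_TRACER` is defined by hand to stand in for the kernel configuration, `notrace` is spelled out here with the GCC/Clang `no_instrument_function` attribute, and `model_text_address()` is a hypothetical helper. Double-underscore identifiers are reserved in ordinary user-space C, so this is for illustration only.

```c
/* Assumption: defined by hand here to model the kernel config switch. */
#define CONFIG_FUNCTION_GRAPH_TRACER 1

/* Assumption: a compiler attribute that suppresses tracer instrumentation. */
#define notrace __attribute__((no_instrument_function))

/* Expands to notrace only when the graph tracer is configured. */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
#define __notrace_funcgraph notrace
#else
#define __notrace_funcgraph
#endif

/* Hypothetical helper that the tracer itself calls, so it must not be traced. */
static __notrace_funcgraph int model_text_address(unsigned long addr)
{
    return addr >= 0x1000 && addr < 0x2000;
}
```

Marking only the affected function, rather than excluding the whole source file from instrumentation, leaves every other function in that file traceable.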
  26. 10 Sep, 2008 1 commit
  27. 29 Jan, 2008 1 commit
  28. 16 May, 2006 1 commit
  29. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!