1. 08 Jun 2017, 10 commits
  2. 24 May 2017, 1 commit
  3. 23 May 2017, 13 commits
  4. 19 May 2017, 2 commits
  5. 18 May 2017, 7 commits
    • bpf: adjust verifier heuristics · 3c2ce60b
      Daniel Borkmann authored
      Current limits on processing program paths no longer reflect
      today's needs: programs have become more complex and the
      verifier smarter, keeping track of more data such as const ALU
      operations, alignment tracking, spilling of PTR_TO_MAP_VALUE_ADJ
      registers, and other features that allow for smarter matching
      of what LLVM generates.
      
      This also has the side effect that there are fewer opportunities
      to prune search states, so we often need to do more work to
      prove safety than in the past because register states and stack
      layouts mismatch. It is generally quite hard to determine what
      caused a sudden increase in complexity; it can be something as
      trivial as a single branch near the beginning of the program
      where LLVM assigned a stack slot that is marked differently in
      other branches, causing a mismatch and forcing the verifier to
      prove safety for the whole rest of the program. As a result,
      programs at even less than half the insn size limit can get
      rejected. We noticed that some programs which load fine on
      pre-4.11 kernels get rejected on more recent kernels for hitting
      these limits. In the vast majority of cases (90+%) pruning
      failed due to register mismatches. For stack mismatches, most
      cases failed due to different stack slot types (invalid, spill,
      misc) rather than differences in spilled registers.
      
      This patch makes pruning more aggressive by also adding markers
      at conditional jumps. Currently, we only mark jump targets for
      pruning; in direct packet access, for example, these are usually
      error paths where we bail out. We found that adding these
      markers can reduce the number of processed insns by up to 30%.
      Another option is to ignore reg->id when probing
      PTR_TO_MAP_VALUE_OR_NULL registers, which as a stand-alone
      change also helps pruning slightly, with up to 7% observed
      complexity reduction. That is, if a previous path with a
      PTR_TO_MAP_VALUE_OR_NULL register for map X was found to be
      safe, then a PTR_TO_MAP_VALUE_OR_NULL register for the same
      map X in the current state must be safe as well. Last but not
      least, the patch adds a scheduling point and bumps the current
      limit for processed instructions to a more adequate value. (A
      sketch of the prune-marker idea follows this entry.)
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c2ce60b
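      A minimal, stand-alone sketch of the idea behind the new prune
      markers (the struct and function names below are invented for
      illustration and are not the verifier's internals): besides
      marking every jump target as a candidate pruning point, the
      fall-through instruction right after a conditional jump is now
      marked as well.

        #include <stdbool.h>
        #include <stdio.h>

        struct insn {
                bool is_cond_jump;   /* conditional branch?              */
                int  jump_target;    /* index of the taken-branch target */
        };

        /* Mark candidate pruning points: the jump target (existing
         * behaviour) and, per the heuristic above, the fall-through
         * instruction of every conditional jump as well. */
        static void mark_prune_points(const struct insn *prog, bool *prune, int len)
        {
                int i;

                for (i = 0; i < len; i++) {
                        if (!prog[i].is_cond_jump)
                                continue;
                        prune[prog[i].jump_target] = true;
                        if (i + 1 < len)
                                prune[i + 1] = true;
                }
        }

        int main(void)
        {
                struct insn prog[6] = {
                        { false, 0 }, { true, 4 }, { false, 0 },
                        { false, 0 }, { false, 0 }, { false, 0 },
                };
                bool prune[6] = { false };
                int i;

                mark_prune_points(prog, prune, 6);
                for (i = 0; i < 6; i++)
                        printf("insn %d: %s\n", i, prune[i] ? "prune point" : "-");
                return 0;
        }

      More prune points mean more saved states to compare against, but
      also more chances for a later path to match an already-verified
      state and stop early, which is where the observed reduction in
      processed insns comes from.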
    • kprobes: Document how optimized kprobes are removed from module unload · 545a0281
      Steven Rostedt (VMware) authored
      Thomas discovered a race condition in the kprobe trace tests:
      the kprobe_optimizer, called from a delayed work queue that does
      the optimizing and "unoptimizing" of a kprobe, can try to modify
      the text after it has been freed by the init code.
      
      The kprobe trace selftest is a special case, and Thomas and I
      investigated whether this could also be a bug with module
      unloading, as it is not obvious from the code how that case is
      handled. After adding lots of printks, I figured it out. Thomas
      suggested that this should be commented so that others will not
      have to go through this exercise again.
      
      Link: http://lkml.kernel.org/r/20170516145835.3827d3aa@gandalf.local.home
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      545a0281
    • ftrace: Remove #ifdef from code and add clear_ftrace_function_probes() stub · 8a49f3e0
      Steven Rostedt (VMware) authored
      No need to add ugly #ifdefs in the code; having a standard stub
      is much prettier. (A sketch of the stub pattern follows this
      entry.)
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      8a49f3e0
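      A small, self-contained illustration of the stub pattern the
      commit refers to (CONFIG_FEATURE_X and clear_feature_state() are
      placeholder names, not the actual kernel symbols): the header
      provides either the real function or an empty static inline
      stub, so call sites never need an #ifdef of their own.

        #include <stdio.h>

        #define CONFIG_FEATURE_X 1      /* flip to 0 for the stubbed build */

        #if CONFIG_FEATURE_X
        static void clear_feature_state(int id)
        {
                printf("clearing state for instance %d\n", id);
        }
        #else
        /* Stub: compiles away entirely, and callers stay #ifdef-free. */
        static inline void clear_feature_state(int id) { (void)id; }
        #endif

        int main(void)
        {
                clear_feature_state(42);  /* identical call site either way */
                return 0;
        }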
    • ftrace/instances: Clear function triggers when removing instances · a0e6369e
      Naveen N. Rao authored
      If instance directories are deleted while there are registered function
      triggers:
      
        # cd /sys/kernel/debug/tracing/instances
        # mkdir test
        # echo "schedule:enable_event:sched:sched_switch" > test/set_ftrace_filter
        # rmdir test
        Unable to handle kernel paging request for data at address 0x00000008
        Unable to handle kernel paging request for data at address 0x00000008
        Faulting instruction address: 0xc0000000021edde8
        Oops: Kernel access of bad area, sig: 11 [#1]
        SMP NR_CPUS=2048
        NUMA
        pSeries
        Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc kvm iptable_filter fuse binfmt_misc pseries_rng rng_core vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c multipath virtio_net virtio_blk virtio_pci crc32c_vpmsum virtio_ring virtio
        CPU: 8 PID: 8694 Comm: rmdir Not tainted 4.11.0-nnr+ #113
        task: c0000000bab52800 task.stack: c0000000baba0000
        NIP: c0000000021edde8 LR: c0000000021f0590 CTR: c000000002119620
        REGS: c0000000baba3870 TRAP: 0300   Not tainted  (4.11.0-nnr+)
        MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>
          CR: 22002422  XER: 20000000
        CFAR: 00007fffabb725a8 DAR: 0000000000000008 DSISR: 40000000 SOFTE: 0
        GPR00: c00000000220f750 c0000000baba3af0 c000000003157e00 0000000000000000
        GPR04: 0000000000000040 00000000000000eb 0000000000000040 0000000000000000
        GPR08: 0000000000000000 0000000000000113 0000000000000000 c00000000305db98
        GPR12: c000000002119620 c00000000fd42c00 0000000000000000 0000000000000000
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: 0000000000000000 0000000000000000 c0000000bab52e90 0000000000000000
        GPR24: 0000000000000000 00000000000000eb 0000000000000040 c0000000baba3bb0
        GPR28: c00000009cb06eb0 c0000000bab52800 c00000009cb06eb0 c0000000baba3bb0
        NIP [c0000000021edde8] ring_buffer_lock_reserve+0x8/0x4e0
        LR [c0000000021f0590] trace_event_buffer_lock_reserve+0xe0/0x1a0
        Call Trace:
        [c0000000baba3af0] [c0000000021f96c8] trace_event_buffer_commit+0x1b8/0x280 (unreliable)
        [c0000000baba3b60] [c00000000220f750] trace_event_buffer_reserve+0x80/0xd0
        [c0000000baba3b90] [c0000000021196b8] trace_event_raw_event_sched_switch+0x98/0x180
        [c0000000baba3c10] [c0000000029d9980] __schedule+0x6e0/0xab0
        [c0000000baba3ce0] [c000000002122230] do_task_dead+0x70/0xc0
        [c0000000baba3d10] [c0000000020ea9c8] do_exit+0x828/0xd00
        [c0000000baba3dd0] [c0000000020eaf70] do_group_exit+0x60/0x100
        [c0000000baba3e10] [c0000000020eb034] SyS_exit_group+0x24/0x30
        [c0000000baba3e30] [c00000000200bcec] system_call+0x38/0x54
        Instruction dump:
        60000000 60420000 7d244b78 7f63db78 4bffaa09 393efff8 793e0020 39200000
        4bfffecc 60420000 3c4c00f7 3842a020 <81230008> 2f890000 409e02f0 a14d0008
        ---[ end trace b917b8985d0e650b ]---
        Unable to handle kernel paging request for data at address 0x00000008
        Faulting instruction address: 0xc0000000021edde8
        Unable to handle kernel paging request for data at address 0x00000008
        Faulting instruction address: 0xc0000000021edde8
        Faulting instruction address: 0xc0000000021edde8
      
      To address this, let's clear all registered function probes
      before deleting the ftrace instance. (A toy model of the
      ordering follows this entry.)
      
      Link: http://lkml.kernel.org/r/c5f1ca624043690bd94642bb6bffd3f2fc504035.1494956770.git.naveen.n.rao@linux.vnet.ibm.com
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      a0e6369e
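      A toy model of the fix (struct instance, clear_function_probes()
      and instance_rmdir() below are illustrative stand-ins, not the
      real tracefs code): everything registered against an instance is
      torn down before the instance itself is freed, so nothing that
      fires later can dereference freed state.

        #include <stdio.h>
        #include <stdlib.h>

        struct probe {
                void (*func)(void);
        };

        struct instance {
                struct probe *probes;
                int nr_probes;
        };

        /* Drop every trigger still attached to the instance. */
        static void clear_function_probes(struct instance *inst)
        {
                free(inst->probes);
                inst->probes = NULL;
                inst->nr_probes = 0;
        }

        static void instance_rmdir(struct instance *inst)
        {
                clear_function_probes(inst);  /* the step the patch adds */
                free(inst);
        }

        int main(void)
        {
                struct instance *inst = calloc(1, sizeof(*inst));

                inst->probes = calloc(4, sizeof(*inst->probes));
                inst->nr_probes = 4;

                instance_rmdir(inst);
                printf("instance removed with no dangling triggers\n");
                return 0;
        }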
    • N
    • tracing/kprobes: Enforce kprobes teardown after testing · 30e7d894
      Thomas Gleixner authored
      Enabling the tracer selftest occasionally triggers the warning
      in text_poke(), which fires when the page to be modified is not
      marked reserved.
      
      The reason is that the tracer selftest installs kprobes on functions marked
      __init for testing. These probes are removed after the tests, but that
      removal schedules the delayed kprobes_optimizer work, which will do the
      actual text poke. If the work is executed after the init text is freed,
      then the warning triggers. The bug can be reproduced reliably when the work
      delay is increased.
      
      Flush the optimizer work and wait for the optimizing/unoptimizing
      lists to become empty before returning from the kprobes tracer
      selftest. That ensures that all operations queued by the probe
      removal have completed. (A toy model of the required ordering
      follows this entry.)
      
      Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.home
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: stable@vger.kernel.org
      Fixes: 6274de49 ("kprobes: Support delayed unoptimizing")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      30e7d894
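      A toy model of the ordering problem and the fix (pending_work
      and flush_optimizer() are invented for the illustration and bear
      no relation to the real kprobes API): any queued work that still
      touches the text must be drained before that text is freed.

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        static char *init_text;              /* stands in for __init text  */
        static void (*pending_work)(void);   /* stands in for delayed work */

        static void optimizer_work(void)
        {
                /* Would be a use-after-free if init_text were already gone. */
                strcpy(init_text, "patched");
                printf("text poked: %s\n", init_text);
        }

        /* Analogue of waiting for the optimizing/unoptimizing lists
         * to become empty. */
        static void flush_optimizer(void)
        {
                if (pending_work) {
                        pending_work();
                        pending_work = NULL;
                }
        }

        int main(void)
        {
                init_text = malloc(16);
                strcpy(init_text, "original");

                pending_work = optimizer_work; /* probe removal queues work */

                flush_optimizer();             /* the fix: drain work first */
                free(init_text);               /* only then free the text   */
                return 0;
        }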
    • tracing: Move postpone selftests to core from early_initcall · b9ef0326
      Steven Rostedt authored
      I hit the following lockdep splat when booting with ftrace selftests
      enabled, as well as CONFIG_PREEMPT and LOCKDEP.
      
       Testing dynamic ftrace ops #1:
       (1 0 1 0 0)
       (1 1 2 0 0)
       (2 1 3 0 169)
       (2 2 4 0 50066)
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 13 at kernel/rcu/srcutree.c:202 check_init_srcu_struct+0x60/0x70
       Modules linked in:
       CPU: 0 PID: 13 Comm: rcu_tasks_kthre Not tainted 4.12.0-rc1-test+ #587
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       task: ffff880119628040 task.stack: ffffc900006a4000
       RIP: 0010:check_init_srcu_struct+0x60/0x70
       RSP: 0000:ffffc900006a7d98 EFLAGS: 00010246
       RAX: 0000000000000246 RBX: 0000000000000000 RCX: 0000000000000000
       RDX: ffff880119628040 RSI: 00000000ffffffff RDI: ffffffff81e5fb40
       RBP: ffffc900006a7e20 R08: 00000023b403c000 R09: 0000000000000001
       R10: ffffc900006a7e40 R11: 0000000000000000 R12: ffffffff81e5fb40
       R13: 0000000000000286 R14: ffff880119628040 R15: ffffc900006a7e98
       FS:  0000000000000000(0000) GS:ffff88011ea00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffff88011edff000 CR3: 0000000001e0f000 CR4: 00000000001406f0
       Call Trace:
        ? __synchronize_srcu+0x6e/0x140
        ? lock_acquire+0xdc/0x1d0
        ? ktime_get_mono_fast_ns+0x5d/0xb0
        synchronize_srcu+0x6f/0x110
        ? synchronize_srcu+0x6f/0x110
        rcu_tasks_kthread+0x20a/0x540
        kthread+0x114/0x150
        ? __rcu_read_unlock+0x70/0x70
        ? kthread_create_on_node+0x40/0x40
        ret_from_fork+0x2e/0x40
       Code: f6 83 70 06 00 00 03 49 89 c5 74 0d be 01 00 00 00 48 89 df e8 42 fa ff ff 4c 89 ee 4c 89 e7 e8 b7 42 75 00 5b 41 5c 41 5d 5d c3 <0f> ff eb aa 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
       ---[ end trace 5c3f4206ce50f6ac ]---
      
      What happens is that the selftests include creating a
      dynamically allocated ftrace_ops, which requires the use of
      synchronize_rcu_tasks(); that in turn uses srcu and triggers the
      above warning.
      
      It appears that synchronize_rcu_tasks() is not yet set up at
      early_initcall() time, but it is by core_initcall(). Moving the
      tests down to that location makes them work properly. (A toy
      model of the initcall ordering follows this entry.)
      
      Link: http://lkml.kernel.org/r/20170517111435.7388c033@gandalf.local.home
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      b9ef0326
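      A minimal sketch of the initcall-ordering point, not kernel code
      (init_rcu_tasks() and run_trace_selftests() stand in for the
      real setup and for the selftests): initcall levels run strictly
      in order, so a selftest registered at the early level runs
      before the infrastructure it needs, while one registered at the
      core level runs after it.

        #include <stdbool.h>
        #include <stdio.h>

        static bool rcu_tasks_ready;  /* readiness of synchronize_rcu_tasks() */

        static void init_rcu_tasks(void)
        {
                rcu_tasks_ready = true;
        }

        static void run_trace_selftests(void)
        {
                if (!rcu_tasks_ready)
                        printf("selftests: splat, srcu machinery not ready\n");
                else
                        printf("selftests: ran safely\n");
        }

        int main(void)
        {
                /* early_initcall placement: runs before the setup it needs. */
                run_trace_selftests();

                init_rcu_tasks();

                /* core_initcall placement: the infrastructure exists now. */
                run_trace_selftests();
                return 0;
        }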
  6. 16 May 2017, 1 commit
  7. 15 May 2017, 6 commits