1. 27 Jun, 2018 1 commit
    • nfp: reject binding to shared blocks · 951a8ee6
      John Hurley committed
      TC shared blocks allow multiple qdiscs to be grouped together and filters
      shared between them. Currently the chains of filters attached to a block
      are only flushed when the block is removed. If a qdisc is removed from a
      block but the block still exists, flow del messages are not passed to the
      callback registered for that qdisc. For the NFP, this presents the
      possibility of rules still existing in hw when they should be removed.
      
      Prevent binding to shared blocks until the kernel can send per qdisc del
      messages when block unbinds occur.
      
      tcf_block_shared() was not used outside of the core until now, so also
      add an empty implementation for builds with CONFIG_NET_CLS=n.
      
      Fixes: 48617387 ("net: sched: introduce shared filter blocks infrastructure")
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
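The shape of the check described in the nfp commit can be sketched in user space as follows. This is a minimal mock, not the driver code: `struct tcf_block` here carries only an `index` field, the rule that a nonzero index marks a shared block is an assumption, and `MOCK_CONFIG_NET_CLS` and `hypothetical_nfp_bind_block()` are invented names for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_CONFIG_NET_CLS 1	/* set to 0 to model a CONFIG_NET_CLS=n build */

/* Mock of the block object; only the field this sketch needs. */
struct tcf_block {
	unsigned int index;	/* assumption: nonzero means the block is shared */
};

#if MOCK_CONFIG_NET_CLS
static inline bool tcf_block_shared(const struct tcf_block *block)
{
	return block->index != 0;
}
#else
/* Empty implementation so callers outside the core still build. */
static inline bool tcf_block_shared(const struct tcf_block *block)
{
	(void)block;
	return false;
}
#endif

/* Hypothetical bind callback: refuse to offload filters on a shared block. */
static int hypothetical_nfp_bind_block(const struct tcf_block *block)
{
	assert(block != NULL);
	if (tcf_block_shared(block))
		return -EOPNOTSUPP;
	return 0;
}
```

With the stub variant selected, every block reports as unshared, so the bind path compiles and behaves as it did before the check was added.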
  2. 20 Jun, 2018 1 commit
    • net/ipv6: respect rcu grace period before freeing fib6_info · 9b0a8da8
      Eric Dumazet committed
      syzbot reported a use-after-free caused by fib6_info being
      freed without a proper RCU grace period.
      
      CPU: 0 PID: 1407 Comm: udevd Not tainted 4.17.0+ #39
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1b9/0x294 lib/dump_stack.c:113
       print_address_description+0x6c/0x20b mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
       __read_once_size include/linux/compiler.h:188 [inline]
       find_rr_leaf net/ipv6/route.c:705 [inline]
       rt6_select net/ipv6/route.c:761 [inline]
       fib6_table_lookup+0x12b7/0x14d0 net/ipv6/route.c:1823
       ip6_pol_route+0x1c2/0x1020 net/ipv6/route.c:1856
       ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2082
       fib6_rule_lookup+0x211/0x6d0 net/ipv6/fib6_rules.c:122
       ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2110
       ip6_route_output include/net/ip6_route.h:82 [inline]
       icmpv6_xrlim_allow net/ipv6/icmp.c:211 [inline]
       icmp6_send+0x147c/0x2da0 net/ipv6/icmp.c:535
       icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43
       ip6_link_failure+0xa5/0x790 net/ipv6/route.c:2244
       dst_link_failure include/net/dst.h:427 [inline]
       ndisc_error_report+0xd1/0x1c0 net/ipv6/ndisc.c:695
       neigh_invalidate+0x246/0x550 net/core/neighbour.c:892
       neigh_timer_handler+0xaf9/0xde0 net/core/neighbour.c:978
       call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
       expire_timers kernel/time/timer.c:1363 [inline]
       __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
       run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
       __do_softirq+0x2e0/0xaf5 kernel/softirq.c:284
       invoke_softirq kernel/softirq.c:364 [inline]
       irq_exit+0x1d1/0x200 kernel/softirq.c:404
       exiting_irq arch/x86/include/asm/apic.h:527 [inline]
       smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
       </IRQ>
      RIP: 0010:strlen+0x5e/0xa0 lib/string.c:482
      Code: 24 00 74 3b 48 bb 00 00 00 00 00 fc ff df 4c 89 e0 48 83 c0 01 48 89 c2 48 89 c1 48 c1 ea 03 83 e1 07 0f b6 14 1a 38 ca 7f 04 <84> d2 75 23 80 38 00 75 de 48 83 c4 08 4c 29 e0 5b 41 5c 5d c3 48
      RSP: 0018:ffff8801af117850 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
      RAX: ffff880197f53bd0 RBX: dffffc0000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff81c5b06c RDI: ffff880197f53bc0
      RBP: ffff8801af117868 R08: ffff88019a976540 R09: 0000000000000000
      R10: ffff88019a976540 R11: 0000000000000000 R12: ffff880197f53bc0
      R13: ffff880197f53bc0 R14: ffffffff899e4e90 R15: ffff8801d91c6a00
       strlen include/linux/string.h:267 [inline]
       getname_kernel+0x24/0x370 fs/namei.c:218
       open_exec+0x17/0x70 fs/exec.c:882
       load_elf_binary+0x968/0x5610 fs/binfmt_elf.c:780
       search_binary_handler+0x17d/0x570 fs/exec.c:1653
       exec_binprm fs/exec.c:1695 [inline]
       __do_execve_file.isra.35+0x16fe/0x2710 fs/exec.c:1819
       do_execveat_common fs/exec.c:1866 [inline]
       do_execve fs/exec.c:1883 [inline]
       __do_sys_execve fs/exec.c:1964 [inline]
       __se_sys_execve fs/exec.c:1959 [inline]
       __x64_sys_execve+0x8f/0xc0 fs/exec.c:1959
       do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x7f1576a46207
      Code: 77 19 f4 48 89 d7 44 89 c0 0f 05 48 3d 00 f0 ff ff 76 e0 f7 d8 64 41 89 01 eb d8 f7 d8 64 41 89 01 eb df b8 3b 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 02 f3 c3 48 8b 15 00 8c 2d 00 f7 d8 64 89 02
      RSP: 002b:00007ffff2784568 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
      RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f1576a46207
      RDX: 0000000001215b10 RSI: 00007ffff2784660 RDI: 00007ffff2785670
      RBP: 0000000000625500 R08: 000000000000589c R09: 000000000000589c
      R10: 0000000000000000 R11: 0000000000000202 R12: 0000000001215b10
      R13: 0000000000000007 R14: 0000000001204250 R15: 0000000000000005
      
      Allocated by task 12188:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:448
       set_track mm/kasan/kasan.c:460 [inline]
       kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
       kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620
       kmalloc include/linux/slab.h:513 [inline]
       kzalloc include/linux/slab.h:706 [inline]
       fib6_info_alloc+0xbb/0x280 net/ipv6/ip6_fib.c:152
       ip6_route_info_create+0x782/0x2b50 net/ipv6/route.c:3013
       ip6_route_add+0x23/0xb0 net/ipv6/route.c:3154
       ipv6_route_ioctl+0x5a5/0x760 net/ipv6/route.c:3660
       inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546
       sock_do_ioctl+0xe4/0x3e0 net/socket.c:973
       sock_ioctl+0x30d/0x680 net/socket.c:1097
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:500 [inline]
       do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684
       ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
       __do_sys_ioctl fs/ioctl.c:708 [inline]
       __se_sys_ioctl fs/ioctl.c:706 [inline]
       __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
       do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 1402:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:448
       set_track mm/kasan/kasan.c:460 [inline]
       __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
       kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
       __cache_free mm/slab.c:3498 [inline]
       kfree+0xd9/0x260 mm/slab.c:3813
       fib6_info_destroy+0x29b/0x350 net/ipv6/ip6_fib.c:207
       fib6_info_release include/net/ip6_fib.h:286 [inline]
       __ip6_del_rt_siblings net/ipv6/route.c:3235 [inline]
       ip6_route_del+0x11c4/0x13b0 net/ipv6/route.c:3316
       ipv6_route_ioctl+0x616/0x760 net/ipv6/route.c:3663
       inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546
       sock_do_ioctl+0xe4/0x3e0 net/socket.c:973
       sock_ioctl+0x30d/0x680 net/socket.c:1097
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:500 [inline]
       do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684
       ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
       __do_sys_ioctl fs/ioctl.c:708 [inline]
       __se_sys_ioctl fs/ioctl.c:706 [inline]
       __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
       do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff8801b5df2580
       which belongs to the cache kmalloc-256 of size 256
      The buggy address is located 8 bytes inside of
       256-byte region [ffff8801b5df2580, ffff8801b5df2680)
      The buggy address belongs to the page:
      page:ffffea0006d77c80 count:1 mapcount:0 mapping:ffff8801da8007c0 index:0xffff8801b5df2e40
      flags: 0x2fffc0000000100(slab)
      raw: 02fffc0000000100 ffffea0006c5cc48 ffffea0007363308 ffff8801da8007c0
      raw: ffff8801b5df2e40 ffff8801b5df2080 0000000100000006 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8801b5df2480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8801b5df2500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      > ffff8801b5df2580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                            ^
       ffff8801b5df2600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8801b5df2680: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
      
      Fixes: a64efe14 ("net/ipv6: introduce fib6_info struct and helpers")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Reported-by: syzbot+9e6d75e3edef427ee888@syzkaller.appspotmail.com
      Acked-by: David Ahern <dsahern@gmail.com>
      Tested-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 15 Jun, 2018 1 commit
  4. 13 Jun, 2018 1 commit
  5. 12 Jun, 2018 1 commit
    • tls: fix NULL pointer dereference on poll · f6fadff3
      Daniel Borkmann committed
      While hacking on kTLS, I ran into the following panic from an
      unprivileged netserver / netperf TCP session:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        PGD 800000037f378067 P4D 800000037f378067 PUD 3c0e61067 PMD 0
        Oops: 0010 [#1] SMP KASAN PTI
        CPU: 1 PID: 2289 Comm: netserver Not tainted 4.17.0+ #139
        Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016
        RIP: 0010:          (null)
        Code: Bad RIP value.
        RSP: 0018:ffff88036abcf740 EFLAGS: 00010246
        RAX: dffffc0000000000 RBX: ffff88036f5f6800 RCX: 1ffff1006debed26
        RDX: ffff88036abcf920 RSI: ffff8803cb1a4f00 RDI: ffff8803c258c280
        RBP: ffff8803c258c280 R08: ffff8803c258c280 R09: ffffed006f559d48
        R10: ffff88037aacea43 R11: ffffed006f559d49 R12: ffff8803c258c280
        R13: ffff8803cb1a4f20 R14: 00000000000000db R15: ffffffffc168a350
        FS:  00007f7e631f4700(0000) GS:ffff8803d1c80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffffffffffffffd6 CR3: 00000003ccf64005 CR4: 00000000003606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         ? tls_sw_poll+0xa4/0x160 [tls]
         ? sock_poll+0x20a/0x680
         ? do_select+0x77b/0x11a0
         ? poll_schedule_timeout.constprop.12+0x130/0x130
         ? pick_link+0xb00/0xb00
         ? read_word_at_a_time+0x13/0x20
         ? vfs_poll+0x270/0x270
         ? deref_stack_reg+0xad/0xe0
         ? __read_once_size_nocheck.constprop.6+0x10/0x10
        [...]
      
      Debugging further, it turns out that calling into ctx->sk_poll() is
      invalid: sk_poll, which was saved from the original TCP socket so
      that tls_sw_poll() could invoke it, is itself NULL.
      
      It looks like the recent conversion from the poll to the poll_mask
      callback, started in 15252423 ("net: add support for ->poll_mask in
      proto_ops"), never converted kTLS: TCP's ->poll was converted to
      ->poll_mask in commit 2c7d3dac ("net/tcp: convert to ->poll_mask"),
      so kTLS wrongly saved the old ->poll pointer, which is now NULL.
      
      Convert kTLS over to use ->poll_mask instead. Also, instead of POLLIN |
      POLLRDNORM, use the proper EPOLLIN | EPOLLRDNORM bits, as is the case
      in tcp_poll_mask(), for the bits that are mangled here.
      
      Fixes: 2c7d3dac ("net/tcp: convert to ->poll_mask")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Watson <davejwatson@fb.com>
      Tested-by: Dave Watson <davejwatson@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
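The failure mode in the tls commit, and the direction of the fix, can be modelled in user space roughly like this. Everything prefixed `mock_` or `MOCK_` is invented for the sketch; the two bit values mirror the usual EPOLLIN and EPOLLRDNORM definitions, and the "clear the readable bits, then re-add them when a TLS record is ready" step is an assumption about what the message means by the bits being mangled.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_EPOLLIN     0x001u
#define MOCK_EPOLLRDNORM 0x040u

struct mock_sock;
typedef unsigned int (*mock_poll_mask_fn)(struct mock_sock *sk,
					  unsigned int events);

struct mock_proto_ops {
	unsigned int (*poll)(struct mock_sock *sk);	/* legacy hook */
	mock_poll_mask_fn poll_mask;			/* converted hook */
};

struct mock_sock {
	const struct mock_proto_ops *ops;
	int tcp_readable;	/* raw TCP bytes are queued */
};

struct mock_tls_ctx {
	mock_poll_mask_fn sk_poll_mask;	/* callback saved from the base socket */
	int record_ready;		/* a full TLS record is available */
};

static unsigned int mock_tcp_poll_mask(struct mock_sock *sk,
				       unsigned int events)
{
	(void)events;
	return sk->tcp_readable ? (MOCK_EPOLLIN | MOCK_EPOLLRDNORM) : 0;
}

/* After the conversion, the base protocol leaves the legacy hook NULL. */
static const struct mock_proto_ops mock_tcp_ops = {
	.poll = NULL,
	.poll_mask = mock_tcp_poll_mask,
};

static void mock_tls_init(struct mock_tls_ctx *ctx, const struct mock_sock *sk)
{
	/* Saving ops->poll here would store NULL and crash on first poll. */
	ctx->sk_poll_mask = sk->ops->poll_mask;
	ctx->record_ready = 0;
}

static unsigned int mock_tls_poll_mask(struct mock_tls_ctx *ctx,
				       struct mock_sock *sk,
				       unsigned int events)
{
	unsigned int mask;

	assert(ctx->sk_poll_mask != NULL);
	mask = ctx->sk_poll_mask(sk, events);
	/* Raw TCP readability is not TLS readability: recompute those bits. */
	mask &= ~(MOCK_EPOLLIN | MOCK_EPOLLRDNORM);
	if (ctx->record_ready)
		mask |= MOCK_EPOLLIN | MOCK_EPOLLRDNORM;
	return mask;
}
```

The point of the model is only that a wrapper must save whichever hook the base protocol actually populates.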
  6. 09 Jun, 2018 1 commit
    • udp: fix rx queue len reported by diag and proc interface · 6c206b20
      Paolo Abeni committed
      After commit 6b229cf7 ("udp: add batching to udp_rmem_release()")
      the sk_rmem_alloc field no longer measures the receive queue
      length exactly, because we batch the rmem release. The issue
      is really apparent only after commit 0d4a6608 ("udp: do rmem bulk
      free even if the rx sk queue is empty"): user space can easily
      observe an empty socket with a nonzero queue length reported by
      the 'ss' tool or the procfs interface.
      
      We need to use a custom UDP helper to report the correct queue length,
      taking into account the forward allocation deficit.
      
      Reported-by: trevor.francis@46labs.com
      Fixes: 6b229cf7 ("UDP: add batching to udp_rmem_release()")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
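The arithmetic behind the udp fix can be illustrated with a toy model. The struct, the function names, and the batching threshold below are assumptions made for the sketch; only the idea of subtracting the not-yet-released deficit from sk_rmem_alloc comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Toy socket: just the two counters the explanation needs. */
struct mock_udp_sock {
	int rmem_alloc;		/* bytes still charged to the receive buffer */
	int forward_deficit;	/* bytes dequeued but not yet released */
};

static void mock_udp_enqueue(struct mock_udp_sock *sk, int size)
{
	sk->rmem_alloc += size;
}

/* Dequeue with batched release: the charge drops only past a threshold. */
static void mock_udp_dequeue(struct mock_udp_sock *sk, int size,
			     int batch_threshold)
{
	sk->forward_deficit += size;
	if (sk->forward_deficit >= batch_threshold) {
		sk->rmem_alloc -= sk->forward_deficit;
		sk->forward_deficit = 0;
	}
}

/* Queue length as it should be reported: charged bytes minus the deficit. */
static int mock_udp_rqueue_get(const struct mock_udp_sock *sk)
{
	assert(sk != NULL);
	return sk->rmem_alloc - sk->forward_deficit;
}
```

After one 1000-byte datagram is queued and read with a 4096-byte threshold, `rmem_alloc` still reads 1000 while `mock_udp_rqueue_get()` returns 0, which is the discrepancy the commit describes.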
  7. 08 Jun, 2018 1 commit
  8. 07 Jun, 2018 1 commit
    • strparser: Add __strp_unpause and use it in ktls. · 7170e604
      Doron Roberts-Kedes committed
      strp_unpause queues strp_work in order to parse any messages that
      arrived while the strparser was paused. However, the process invoking
      strp_unpause could eagerly parse a buffered message itself if it held
      the sock lock.
      
      __strp_unpause is an alternative to strp_unpause that avoids the scheduling
      overhead that results when a receiving thread unpauses the strparser
      and waits for the next message to be delivered by the workqueue thread.
      
      This patch more than doubled the IOPS achieved in a benchmark of NBD
      traffic encrypted using ktls.

      Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 06 Jun, 2018 1 commit
  10. 05 Jun, 2018 9 commits
  11. 03 Jun, 2018 10 commits
  12. 02 Jun, 2018 1 commit
  13. 01 Jun, 2018 4 commits
    • ipvs: add ipv6 support to ftp · d12e1229
      Julian Anastasov committed
      Add support for FTP commands with extended format (RFC 2428):
      
      - FTP EPRT: IPv4 and IPv6, active mode, similar to PORT
      - FTP EPSV: IPv4 and IPv6, passive mode, similar to PASV.
        An EPSV response usually contains only the port, but we allow
        the real server to provide a different address
      
      We restrict the control and data connections to the same
      address family.
      
      Allow the "(" and ")" to be optional in PASV response.
      
      Also, add an ipvsh argument to the pkt_in/pkt_out handlers for better
      access to the payload after the transport header.

      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
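For readers unfamiliar with the extended format, the EPRT argument defined by RFC 2428 has the shape `<d><net-prt><d><net-addr><d><tcp-port><d>`, where `<d>` is a delimiter (usually `|`) and net-prt is 1 for IPv4 or 2 for IPv6. The parser below is a minimal user-space sketch of that grammar, not the ipvs implementation; `struct eprt_arg` and `parse_eprt_arg()` are hypothetical names, and the address is copied without being validated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct eprt_arg {
	int af;			/* 1 = IPv4, 2 = IPv6 (RFC 2428 net-prt) */
	char addr[64];		/* address text, not validated here */
	unsigned int port;
};

/* Parse the text that follows "EPRT ". Returns 0 on success, -1 otherwise. */
static int parse_eprt_arg(const char *s, struct eprt_arg *out)
{
	const char *p, *q;
	char *end;
	size_t len;
	long v;
	char d;

	assert(s != NULL && out != NULL);
	d = s[0];
	if (d < 33 || d > 126)	/* delimiter must be printable ASCII */
		return -1;

	/* net-prt */
	p = s + 1;
	v = strtol(p, &end, 10);
	if (end == p || *end != d || (v != 1 && v != 2))
		return -1;
	out->af = (int)v;

	/* net-addr: everything up to the next delimiter */
	p = end + 1;
	q = strchr(p, d);
	if (q == NULL)
		return -1;
	len = (size_t)(q - p);
	if (len == 0 || len >= sizeof(out->addr))
		return -1;
	memcpy(out->addr, p, len);
	out->addr[len] = '\0';

	/* tcp-port, followed by the closing delimiter */
	p = q + 1;
	v = strtol(p, &end, 10);
	if (end == p || *end != d || v < 1 || v > 65535)
		return -1;
	out->port = (unsigned int)v;
	return 0;
}
```

Calling it on `|2|1080::8:800:200C:417A|5282|` (an example argument from the RFC) fills in family 2, the address text, and port 5282. An EPSV reply such as `229 Entering Extended Passive Mode (|||6446|)` uses the same delimiters with the first two fields empty, which this EPRT-only sketch deliberately rejects.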
    • netfilter: nf_tables: fix chain dependency validation · a654de8f
      Pablo Neira Ayuso committed
      The following ruleset:
      
       add table ip filter
       add chain ip filter input { type filter hook input priority 4; }
       add chain ip filter ap
       add rule ip filter input jump ap
       add rule ip filter ap masquerade
      
      results in a panic, because the masquerade extension should be rejected
      from the filter chain. The existing validation is missing a chain
      dependency check when the rule is added to the non-base chain.
      
      This patch fixes the problem by walking down the rules from the
      basechains, searching for either immediate or lookup expressions, then
      jumping to non-base chains and again walking down the rules to perform
      the expression validation, so we make sure the full ruleset graph is
      validated. This is done only once, from the commit phase; if there
      is a problem, we abort the transaction and perform fine-grained
      validation for error reporting. This patch requires 00308791
      ("netfilter: nfnetlink: allow commit to fail") to achieve this behaviour.
      
      This patch also adds a cleanup callback to nfnl batch interface to reset
      the validate state from the exit path.
      
      As a result of this patch, nf_tables_check_loops() doesn't use
      ->validate to check for loops, instead it just checks for immediate
      expressions.

      Reported-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
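The walk described in the nf_tables commit can be pictured with a small model: start at each base chain, follow jumps into non-base chains, and check every expression against the type of the base chain the walk started from. All types and names below are invented for illustration, only jump expressions are followed (the real code also considers lookup expressions), and the depth limit of 16 is an assumed value.

```c
#include <assert.h>
#include <stddef.h>

enum mock_chain_type { MOCK_CHAIN_FILTER, MOCK_CHAIN_NAT };
enum mock_expr_kind { MOCK_EXPR_PLAIN, MOCK_EXPR_JUMP, MOCK_EXPR_MASQ };

struct mock_chain;

struct mock_rule {
	enum mock_expr_kind kind;
	const struct mock_chain *target;	/* used by MOCK_EXPR_JUMP only */
};

struct mock_chain {
	int is_base;
	enum mock_chain_type type;	/* meaningful for base chains */
	const struct mock_rule *rules;
	size_t nrules;
};

#define MOCK_MAX_JUMP_DEPTH 16	/* assumed limit; also stops jump loops */

/* Validate one chain in the context of the base chain the walk began at. */
static int mock_validate_from(const struct mock_chain *c,
			      enum mock_chain_type base_type, int depth)
{
	size_t i;

	if (c == NULL || depth > MOCK_MAX_JUMP_DEPTH)
		return -1;
	for (i = 0; i < c->nrules; i++) {
		const struct mock_rule *r = &c->rules[i];

		/* masquerade is only acceptable when reached from a nat chain */
		if (r->kind == MOCK_EXPR_MASQ && base_type != MOCK_CHAIN_NAT)
			return -1;
		if (r->kind == MOCK_EXPR_JUMP &&
		    mock_validate_from(r->target, base_type, depth + 1) < 0)
			return -1;
	}
	return 0;
}

/* Walk the whole ruleset graph, starting from every base chain. */
static int mock_validate_ruleset(const struct mock_chain *const *chains,
				 size_t n)
{
	size_t i;

	assert(chains != NULL);
	for (i = 0; i < n; i++) {
		if (chains[i]->is_base &&
		    mock_validate_from(chains[i], chains[i]->type, 0) < 0)
			return -1;
	}
	return 0;
}
```

Modelling the ruleset from the message (a filter base chain `input` that jumps to `ap`, which holds a masquerade rule) makes `mock_validate_ruleset()` fail, whereas checking `ap` in isolation would not reveal the problem.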
    • rtnetlink: Remove VLA usage · ccf8dbcd
      Kees Cook committed
      In the quest to remove all stack VLA usage from the kernel[1], this
      allocates the maximum size expected for all possible types and adds
      sanity-checks at both registration and usage to make sure nothing gets
      out of sync. This matches the proposed VLA solution for nfnetlink[2]. The
      values chosen here were based on finding assignments for .maxtype and
      .slave_maxtype and manually counting the enums:
      
      slave_maxtype (max 33):
      	IFLA_BRPORT_MAX     33
      	IFLA_BOND_SLAVE_MAX  9
      
      maxtype (max 45):
      	IFLA_BOND_MAX       28
      	IFLA_BR_MAX         45
      	__IFLA_CAIF_HSI_MAX  8
      	IFLA_CAIF_MAX        4
      	IFLA_CAN_MAX        16
      	IFLA_GENEVE_MAX     12
      	IFLA_GRE_MAX        25
      	IFLA_GTP_MAX         5
      	IFLA_HSR_MAX         7
      	IFLA_IPOIB_MAX       4
      	IFLA_IPTUN_MAX      21
      	IFLA_IPVLAN_MAX      3
      	IFLA_MACSEC_MAX     15
      	IFLA_MACVLAN_MAX     7
      	IFLA_PPP_MAX         2
      	__IFLA_RMNET_MAX     4
      	IFLA_VLAN_MAX        6
      	IFLA_VRF_MAX         2
      	IFLA_VTI_MAX         7
      	IFLA_VXLAN_MAX      28
      	VETH_INFO_MAX        2
      	VXCAN_INFO_MAX       2
      
      This additionally changes maxtype and slave_maxtype fields to unsigned,
      since they're only ever using positive values.
      
      [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
      [2] https://patchwork.kernel.org/patch/10439647/

      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
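The pattern the rtnetlink commit describes, replacing a stack VLA with a fixed maximum-size array plus sanity checks at registration and at use, might look like this in miniature. The names are hypothetical, and the bound of 45 is simply the largest maxtype listed in the message, not necessarily the constant the patch defines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_TYPE 45	/* largest maxtype in the table above */

struct mock_link_ops {
	const char *kind;
	unsigned int maxtype;	/* unsigned: only positive values occur */
};

/* Registration-time check: refuse ops that would overflow the fixed array. */
static int mock_register_ops(const struct mock_link_ops *ops)
{
	assert(ops != NULL);
	return ops->maxtype > MOCK_MAX_TYPE ? -1 : 0;
}

/* Count distinct known attribute types without a variable-length array. */
static int mock_count_known_attrs(const struct mock_link_ops *ops,
				  const unsigned int *types, size_t n)
{
	unsigned char seen[MOCK_MAX_TYPE + 1];	/* was: seen[ops->maxtype + 1] */
	int count = 0;
	size_t i;

	/* Usage-time check, in case registration was bypassed. */
	if (ops->maxtype > MOCK_MAX_TYPE)
		return -1;

	memset(seen, 0, sizeof(seen));
	for (i = 0; i < n; i++) {
		if (types[i] == 0 || types[i] > ops->maxtype)
			continue;	/* unknown type: ignore */
		if (!seen[types[i]]) {
			seen[types[i]] = 1;
			count++;
		}
	}
	return count;
}
```

The trade-off is a slightly larger fixed stack frame in exchange for a size the compiler knows at build time.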
    • tcp: minor optimization around tcp_hdr() usage in receive path · 3d97d88e
      Yafang Shao committed
      This is a follow-up to
      commit ea1627c2 ("tcp: minor optimizations around tcp_hdr() usage").
      At this point, skb->data is the same as tcp_hdr() because the TCP header
      has not been pulled yet, so use the less expensive one to get the TCP header.
      
      Remove the third parameter of tcp_rcv_established() and put it into
      the function body.
      
      Furthermore, the local variables are listed as a reverse christmas tree :)
      
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 29 May, 2018 7 commits