1. 31 May, 2019 1 commit
    • Y
      ipv4: tcp_input: fix stack out of bounds when parsing TCP options. · 9609dad2
      Committed by Young Xiao
      The TCP option parsing routine in the tcp_parse_options() function
      could read one byte past the end of the TCP options buffer.
      
      1         while (length > 0) {
      2                 int opcode = *ptr++;
      3                 int opsize;
      4
      5                 switch (opcode) {
      6                 case TCPOPT_EOL:
      7                         return;
      8                 case TCPOPT_NOP:        /* Ref: RFC 793 section 3.1 */
      9                         length--;
      10                        continue;
      11                default:
      12                        opsize = *ptr++; //out of bound access
      
      If length = 1, the first access happens at line 2 and a second
      access happens at line 12, which reads one byte past the end of
      the buffer.
      
      Therefore, the patch checks that the available data length is
      large enough to parse both the TCP option code and the option size.
      Signed-off-by: Young Xiao <92siuyang@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
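The bounds check described in the tcp_parse_options fix above can be sketched in userspace C. This is a minimal illustration, not the kernel code: the function name `count_tcp_options` and the `OPT_*` constants are hypothetical, and the sketch only counts well-formed options instead of interpreting them. The point it shows is the `length < 2` test before the size byte is read.

```c
#include <stddef.h>

#define OPT_EOL 0 /* end of option list (hypothetical stand-in for TCPOPT_EOL) */
#define OPT_NOP 1 /* one-byte padding (hypothetical stand-in for TCPOPT_NOP) */

/*
 * Walk a TCP option buffer and count the well-formed multi-byte options.
 * Stops at the first truncated or malformed option.
 */
static int count_tcp_options(const unsigned char *ptr, int length)
{
    int count = 0;

    while (length > 0) {
        int opcode = *ptr++;
        int opsize;

        switch (opcode) {
        case OPT_EOL:
            return count;
        case OPT_NOP:
            length--;
            continue;
        default:
            /* Need room for both the kind byte and the size byte. */
            if (length < 2)
                return count;
            opsize = *ptr++;
            if (opsize < 2)      /* size must cover kind + size bytes */
                return count;
            if (opsize > length) /* option would run past the buffer */
                return count;
            count++;
            ptr += opsize - 2;
            length -= opsize;
        }
    }
    return count;
}
```

With a buffer such as `{1, 1, 2}` (two NOPs followed by a lone kind byte), the loop reaches the default case with `length == 1` and returns before touching the byte after the buffer.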
  2. 26 May, 2019 1 commit
  3. 23 May, 2019 2 commits
    • E
      ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST · 903869bd
      Committed by Eric Dumazet
      ip_sf_list_clear_all() needs to be defined even if !CONFIG_IP_MULTICAST
      
      Fixes: 3580d04a ("ipv4/igmp: fix another memory leak in igmpv3_del_delrec()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      ipv4/igmp: fix another memory leak in igmpv3_del_delrec() · 3580d04a
      Committed by Eric Dumazet
      syzbot reported memory leaks [1] that I have tracked back to
      a missing cleanup in igmpv3_del_delrec() when
      (im->sfmode != MCAST_INCLUDE)
      
      Add ip_sf_list_clear_all() and kfree_pmc() helpers to explicitly
      handle the cleanups before freeing.
      
      [1]
      
      BUG: memory leak
      unreferenced object 0xffff888123e32b00 (size 64):
        comm "softirq", pid 0, jiffies 4294942968 (age 8.010s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 e0 00 00 01 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000006105011b>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
          [<000000006105011b>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<000000006105011b>] slab_alloc mm/slab.c:3326 [inline]
          [<000000006105011b>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
          [<000000004bba8073>] kmalloc include/linux/slab.h:547 [inline]
          [<000000004bba8073>] kzalloc include/linux/slab.h:742 [inline]
          [<000000004bba8073>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
          [<000000004bba8073>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
          [<00000000a46a65a0>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
          [<000000005956ca89>] do_ip_setsockopt.isra.0+0x1795/0x1930 net/ipv4/ip_sockglue.c:957
          [<00000000848e2d2f>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
          [<00000000b9db185c>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
          [<000000003028e438>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130
          [<0000000015b65589>] __sys_setsockopt+0x98/0x120 net/socket.c:2078
          [<00000000ac198ef0>] __do_sys_setsockopt net/socket.c:2089 [inline]
          [<00000000ac198ef0>] __se_sys_setsockopt net/socket.c:2086 [inline]
          [<00000000ac198ef0>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
          [<000000000a770437>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
          [<00000000d3adb93b>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 9c8bb163 ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Hangbin Liu <liuhangbin@gmail.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
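The cleanup pattern named in the igmpv3_del_delrec() fix above (clear every source-filter list, then free the record) can be sketched in userspace C. The struct and function names below are hypothetical stand-ins, `malloc`/`free` replace the kernel allocators, and the helpers return a count of freed nodes only so the behaviour can be checked; the kernel helpers are not claimed to look like this.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one source-filter list node. */
struct sf_node {
    struct sf_node *next;
    unsigned int addr;
};

/* Hypothetical stand-in for a multicast record owning two lists. */
struct mc_record {
    struct sf_node *tomb;
    struct sf_node *sources;
};

/* Prepend a node; on allocation failure the list is returned unchanged. */
static struct sf_node *sf_list_push(struct sf_node *head, unsigned int addr)
{
    struct sf_node *n = malloc(sizeof(*n));

    if (!n)
        return head;
    n->addr = addr;
    n->next = head;
    return n;
}

/* Free every node of a list, reading next before freeing the node. */
static int sf_list_clear_all(struct sf_node *psf)
{
    int freed = 0;

    while (psf) {
        struct sf_node *next = psf->next;

        free(psf);
        psf = next;
        freed++;
    }
    return freed;
}

/* Free both lists and then the record itself, so nothing is leaked. */
static int mc_record_free(struct mc_record *rec)
{
    int freed = sf_list_clear_all(rec->tomb) + sf_list_clear_all(rec->sources);

    free(rec);
    return freed;
}
```

Routing every free of the record through one helper is what makes it hard to forget a list on a rarely taken path.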
  4. 21 May, 2019 6 commits
  5. 20 May, 2019 1 commit
  6. 17 May, 2019 1 commit
  7. 16 May, 2019 2 commits
  8. 15 May, 2019 1 commit
  9. 14 May, 2019 1 commit
    • J
      bpf: sockmap remove duplicate queue free · c42253cc
      Committed by John Fastabend
      In tcp bpf remove we free the cork list and purge the ingress msg
      list. However, we do this before the ref count reaches zero, so
      some other access could still be in progress. In this case
      (tcp close and/or tcp_unhash) we happen to also hold the sock
      lock, so no such path exists, but let's fix it anyway because it is
      extremely fragile and breaks the reference counting rules. Also, we
      already check the cork list and ingress msg queue and free them once
      the ref count reaches zero, so it's wasteful to check twice.
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
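The reference counting rule behind the sockmap fix above can be sketched as follows: queues owned by an object are released only by the final put, never by an earlier remove step. All names here are hypothetical, the counter is a plain int rather than an atomic refcount, and the "free" operations are just counters so the single release is observable.

```c
/* Hypothetical stand-in for a psock with two owned queues. */
struct psock_sketch {
    int refcnt;
    int cork_frees;     /* how many times the cork list was freed */
    int ingress_purges; /* how many times the ingress queue was purged */
};

/* Runs exactly once, when the last reference is dropped. */
static void psock_destroy_sketch(struct psock_sketch *p)
{
    p->cork_frees++;
    p->ingress_purges++;
}

static void psock_put_sketch(struct psock_sketch *p)
{
    if (--p->refcnt == 0)
        psock_destroy_sketch(p);
}

/*
 * Remove path: drops its own reference but does not free the queues
 * itself, because another holder may still be using them.
 */
static void tcp_bpf_remove_sketch(struct psock_sketch *p)
{
    psock_put_sketch(p);
}
```

If a second holder exists, the remove path leaves the queues alone and the later put frees them once.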
  10. 10 May, 2019 1 commit
  11. 09 May, 2019 1 commit
  12. 06 May, 2019 2 commits
  13. 05 May, 2019 3 commits
  14. 04 May, 2019 1 commit
    • D
      ipmr_base: Do not reset index in mr_table_dump · 7fcd1e03
      Committed by David Ahern
      e is the counter used to save the location of a dump when an
      skb is filled. Once the walk of the table is complete, mr_table_dump
      needs to return without resetting that index to 0. With the reset,
      a dump of a specific table keeps looping, because there is no way
      to indicate that the walk of the table is done.
      
      Move the reset to the caller so the dump of each table starts at 0,
      but the loop counter is maintained if a dump fills an skb.
      
      Fixes: e1cedae1 ("ipmr: Refactor mr_rtm_dumproute")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
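The resumable-dump behaviour described in the mr_table_dump fix above can be sketched in userspace C. Everything here is a hypothetical model: tables are plain int arrays, the skb is a fixed-capacity output array, and `struct dump_state` stands in for whatever state the kernel keeps between dump calls. The sketch shows the per-table function leaving the index alone and the caller resetting it when it moves to the next table.

```c
#include <stddef.h>

/* Saved between calls so a dump can resume where it stopped. */
struct dump_state {
    int table; /* index of the table being walked */
    int entry; /* index of the next entry inside that table */
};

/* Dump entries from *e onward; -1 means "out is full", 0 means "done". */
static int table_dump_sketch(const int *entries, int n, int *e,
                             int *out, int cap, int *used)
{
    int i;

    for (i = *e; i < n; i++) {
        if (*used == cap) {
            *e = i; /* remember where to resume */
            return -1;
        }
        out[(*used)++] = entries[i];
    }
    *e = i; /* keep the index: resetting here would restart the walk */
    return 0;
}

/* Caller walks all tables and resets the index between tables. */
static int dump_all_sketch(const int *const tables[], const int sizes[],
                           int ntables, struct dump_state *st,
                           int *out, int cap)
{
    int used = 0;

    while (st->table < ntables) {
        if (table_dump_sketch(tables[st->table], sizes[st->table],
                              &st->entry, out, cap, &used) < 0)
            break;
        st->entry = 0; /* next table starts at 0 */
        st->table++;
    }
    return used;
}
```

When a single table is dumped directly through `table_dump_sketch`, a follow-up call with the saved index writes nothing, which is how the end of the walk becomes visible to the caller.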
  15. 02 May, 2019 2 commits
    • E
      udp: fix GRO packet of death · 4dd2b82d
      Committed by Eric Dumazet
      syzbot was able to crash the host by sending UDP packets with a 0 payload.
      
      TCP does not have this issue since we do not aggregate packets without
      payload.
      
      Since dev_gro_receive() sets gso_size based on skb_gro_len(skb)
      it seems not worth trying to cope with padded packets.
      
      BUG: KASAN: slab-out-of-bounds in skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
      Read of size 16 at addr ffff88808893fff0 by task syz-executor612/7889
      
      CPU: 0 PID: 7889 Comm: syz-executor612 Not tainted 5.1.0-rc7+ #96
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       __asan_report_load16_noabort+0x14/0x20 mm/kasan/generic_report.c:133
       skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
       udp_gro_receive_segment net/ipv4/udp_offload.c:382 [inline]
       call_gro_receive include/linux/netdevice.h:2349 [inline]
       udp_gro_receive+0xb61/0xfd0 net/ipv4/udp_offload.c:414
       udp4_gro_receive+0x763/0xeb0 net/ipv4/udp_offload.c:478
       inet_gro_receive+0xe72/0x1110 net/ipv4/af_inet.c:1510
       dev_gro_receive+0x1cd0/0x23c0 net/core/dev.c:5581
       napi_gro_frags+0x36b/0xd10 net/core/dev.c:5843
       tun_get_user+0x2f24/0x3fb0 drivers/net/tun.c:1981
       tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2027
       call_write_iter include/linux/fs.h:1866 [inline]
       do_iter_readv_writev+0x5e1/0x8e0 fs/read_write.c:681
       do_iter_write fs/read_write.c:957 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:938
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1002
       do_writev+0x15e/0x370 fs/read_write.c:1037
       __do_sys_writev fs/read_write.c:1110 [inline]
       __se_sys_writev fs/read_write.c:1107 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1107
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441cc0
      Code: 05 48 3d 01 f0 ff ff 0f 83 9d 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 3d 51 93 29 00 00 75 14 b8 14 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 74 09 fc ff c3 48 83 ec 08 e8 ba 2b 00 00
      RSP: 002b:00007ffe8c716118 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 00007ffe8c716150 RCX: 0000000000441cc0
      RDX: 0000000000000001 RSI: 00007ffe8c716170 RDI: 00000000000000f0
      RBP: 0000000000000000 R08: 000000000000ffff R09: 0000000000a64668
      R10: 0000000020000040 R11: 0000000000000246 R12: 000000000000c2d9
      R13: 0000000000402b50 R14: 0000000000000000 R15: 0000000000000000
      
      Allocated by task 5143:
       save_stack+0x45/0xd0 mm/kasan/common.c:75
       set_track mm/kasan/common.c:87 [inline]
       __kasan_kmalloc mm/kasan/common.c:497 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470
       kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:505
       slab_post_alloc_hook mm/slab.h:437 [inline]
       slab_alloc mm/slab.c:3393 [inline]
       kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3555
       mm_alloc+0x1d/0xd0 kernel/fork.c:1030
       bprm_mm_init fs/exec.c:363 [inline]
       __do_execve_file.isra.0+0xaa3/0x23f0 fs/exec.c:1791
       do_execveat_common fs/exec.c:1865 [inline]
       do_execve fs/exec.c:1882 [inline]
       __do_sys_execve fs/exec.c:1958 [inline]
       __se_sys_execve fs/exec.c:1953 [inline]
       __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 5351:
       save_stack+0x45/0xd0 mm/kasan/common.c:75
       set_track mm/kasan/common.c:87 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:467
       __cache_free mm/slab.c:3499 [inline]
       kmem_cache_free+0x86/0x260 mm/slab.c:3765
       __mmdrop+0x238/0x320 kernel/fork.c:677
       mmdrop include/linux/sched/mm.h:49 [inline]
       finish_task_switch+0x47b/0x780 kernel/sched/core.c:2746
       context_switch kernel/sched/core.c:2880 [inline]
       __schedule+0x81b/0x1cc0 kernel/sched/core.c:3518
       preempt_schedule_irq+0xb5/0x140 kernel/sched/core.c:3745
       retint_kernel+0x1b/0x2d
       arch_local_irq_restore arch/x86/include/asm/paravirt.h:767 [inline]
       kmem_cache_free+0xab/0x260 mm/slab.c:3766
       anon_vma_chain_free mm/rmap.c:134 [inline]
       unlink_anon_vmas+0x2ba/0x870 mm/rmap.c:401
       free_pgtables+0x1af/0x2f0 mm/memory.c:394
       exit_mmap+0x2d1/0x530 mm/mmap.c:3144
       __mmput kernel/fork.c:1046 [inline]
       mmput+0x15f/0x4c0 kernel/fork.c:1067
       exec_mmap fs/exec.c:1046 [inline]
       flush_old_exec+0x8d9/0x1c20 fs/exec.c:1279
       load_elf_binary+0x9bc/0x53f0 fs/binfmt_elf.c:864
       search_binary_handler fs/exec.c:1656 [inline]
       search_binary_handler+0x17f/0x570 fs/exec.c:1634
       exec_binprm fs/exec.c:1698 [inline]
       __do_execve_file.isra.0+0x1394/0x23f0 fs/exec.c:1818
       do_execveat_common fs/exec.c:1865 [inline]
       do_execve fs/exec.c:1882 [inline]
       __do_sys_execve fs/exec.c:1958 [inline]
       __se_sys_execve fs/exec.c:1953 [inline]
       __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff88808893f7c0
       which belongs to the cache mm_struct of size 1496
      The buggy address is located 600 bytes to the right of
       1496-byte region [ffff88808893f7c0, ffff88808893fd98)
      The buggy address belongs to the page:
      page:ffffea0002224f80 count:1 mapcount:0 mapping:ffff88821bc40ac0 index:0xffff88808893f7c0 compound_mapcount: 0
      flags: 0x1fffc0000010200(slab|head)
      raw: 01fffc0000010200 ffffea00025b4f08 ffffea00027b9d08 ffff88821bc40ac0
      raw: ffff88808893f7c0 ffff88808893e440 0000000100000001 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88808893fe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff88808893ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff88808893ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                                   ^
       ffff888088940000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff888088940080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      
      Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
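The commit message above does not show the check that was added, so the following is only a sketch of the kind of test it describes: refuse to aggregate a UDP segment that has no payload, and do not try to cope with padded packets. The function name, the parameters, and the exact conditions are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define UDP_HDR_LEN 8u /* size of a UDP header in bytes */

/*
 * ulen:    UDP length field, already converted to host byte order
 * gro_len: bytes available from the start of the UDP header
 *
 * Returns true only when the segment carries payload and the length
 * field matches the data actually present (no padding, no lies).
 */
static bool udp_segment_mergeable_sketch(unsigned int ulen, size_t gro_len)
{
    if (ulen <= UDP_HDR_LEN) /* header only: zero payload */
        return false;
    if (ulen != gro_len)     /* padded or inconsistent packet */
        return false;
    return true;
}
```

A caller would treat a `false` result as "flush this packet instead of merging it".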
    • S
      ipv4: ip_do_fragment: Preserve skb_iif during fragmentation · d2f0c961
      Committed by Shmulik Ladkani
      Previously, during fragmentation after forwarding, skb->skb_iif was not
      preserved, i.e. 'ip_copy_metadata' did not copy skb_iif from the given
      'from' skb.
      
      As a result, ip_do_fragment created fragments with zero skb_iif,
      leading to inconsistent behavior.
      
      Assume for example an eBPF program attached at tc egress (post
      forwarding) that examines __sk_buff->ingress_ifindex:
       - the correct iif is observed if the forwarding path does not involve
         fragmentation/refragmentation
       - a bogus iif is observed if the forwarding path involves
         fragmentation/refragmentation
      
      Fix this by preserving skb_iif during 'ip_copy_metadata'.
      Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
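The effect of the ip_copy_metadata change above can be illustrated with a small model. The struct below is a hypothetical reduction of packet metadata to three fields (only `skb_iif` is named in the commit message; `mark` and `priority` are included as illustrative neighbours), and the function is a sketch of a per-fragment metadata copy, not the kernel routine.

```c
/* Hypothetical reduction of per-packet metadata. */
struct pkt_meta_sketch {
    int skb_iif;           /* ifindex the packet arrived on */
    unsigned int mark;     /* illustrative extra field */
    unsigned int priority; /* illustrative extra field */
};

/* Copy metadata from the original packet to a new fragment. */
static void copy_metadata_sketch(struct pkt_meta_sketch *to,
                                 const struct pkt_meta_sketch *from)
{
    to->mark = from->mark;
    to->priority = from->priority;
    to->skb_iif = from->skb_iif; /* without this line fragments keep iif 0 */
}
```

A consumer that reads the ingress ifindex from a fragment then sees the same value it would see on an unfragmented packet.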
  16. 01 May, 2019 8 commits
  17. 30 Apr, 2019 2 commits
  18. 28 Apr, 2019 4 commits
    • P
      udp: fix GRO reception in case of length mismatch · 21f1b8a6
      Committed by Paolo Abeni
      Currently, the UDP GRO code path misbehaves in some edge cases:
      aggregation can happen even on packets with different lengths.
      
      Fix the above by rewriting the 'complete' condition for GRO
      packets. While at it, note explicitly that we allow merging the
      first packet per burst below gso_size.
      Reported-by: Sean Tong <seantong114@gmail.com>
      Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
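One possible reading of the 'complete' condition described above is sketched below. This is an interpretation, not the patch: it assumes that a packet of the same length as the held burst is merged, a shorter packet is merged once and then ends the burst, and a longer packet ends the burst without being merged. The enum and function names are hypothetical.

```c
/* Hypothetical outcomes for a packet arriving while a burst is held. */
enum gro_action_sketch {
    SKETCH_MERGE_CONTINUE,     /* same length: merge, keep aggregating */
    SKETCH_MERGE_AND_COMPLETE, /* shorter: merge, then finish the burst */
    SKETCH_COMPLETE_NO_MERGE   /* longer: finish the burst, do not merge */
};

/*
 * held_len: per-segment length of the burst held so far (the gso_size)
 * new_len:  length of the newly received packet
 */
static enum gro_action_sketch gro_decide_sketch(unsigned int held_len,
                                                unsigned int new_len)
{
    if (new_len > held_len)
        return SKETCH_COMPLETE_NO_MERGE;
    if (new_len < held_len)
        return SKETCH_MERGE_AND_COMPLETE;
    return SKETCH_MERGE_CONTINUE;
}
```

Under this reading every aggregated burst consists of equal-length segments followed by at most one shorter trailing segment.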
    • J
      genetlink: optionally validate strictly/dumps · ef6243ac
      Committed by Johannes Berg
      Add options to strictly validate messages and dump messages.
      Validating dump messages non-strictly may sometimes be required,
      so add an option for that as well.
      
      Since none of this can really be applied to existing commands,
      set the options everywhere using the following spatch:
      
          @@
          identifier ops;
          expression X;
          @@
          struct genl_ops ops[] = {
          ...,
           {
                  .cmd = X,
          +       .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
                  ...
           },
          ...
          };
      
      For new commands one should just not copy the .validate 'opt-out'
      flags and thus get strict validation.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
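The opt-out design described in the genetlink commit above can be modelled in a few lines. The flag names are taken from the spatch in the commit message; their bit values, the `struct op_sketch` type, and the helper functions are assumptions made for this sketch and do not come from the kernel headers.

```c
#include <stdbool.h>

/* Bit values are assumed for illustration. */
#define GENL_DONT_VALIDATE_STRICT (1u << 0)
#define GENL_DONT_VALIDATE_DUMP   (1u << 1)

/* Hypothetical reduction of one command entry. */
struct op_sketch {
    int cmd;
    unsigned int validate; /* opt-out flags; 0 means fully strict */
};

/* Strict validation applies unless the command opted out. */
static bool op_validates_strictly(const struct op_sketch *op)
{
    return !(op->validate & GENL_DONT_VALIDATE_STRICT);
}

/* Dump requests are validated unless the command opted out. */
static bool op_validates_dumps(const struct op_sketch *op)
{
    return !(op->validate & GENL_DONT_VALIDATE_DUMP);
}
```

Because the flags are negative ("don't validate"), a new entry that simply leaves `.validate` at zero gets the strict behaviour, while converted legacy entries keep their old behaviour.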
    • J
      netlink: make validation more configurable for future strictness · 8cb08174
      Committed by Johannes Berg
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
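The four strictness options listed in the netlink commit above compose naturally as a bitmask. The sketch below is a toy model under stated assumptions: the `V_*` names, the `struct attr_sketch` type, and the policy representation (an array of expected lengths where 0 means "no policy entry") are all hypothetical and much simpler than real netlink attribute policies.

```c
#include <stdbool.h>

#define V_TRAILING     (1u << 0) /* reject trailing data */
#define V_MAXTYPE      (1u << 1) /* reject types above the known maximum */
#define V_UNSPEC       (1u << 2) /* reject attrs without a policy entry */
#define V_STRICT_ATTRS (1u << 3) /* require the exact expected length */

#define V_LIBERAL           0u
#define V_DEPRECATED_STRICT (V_TRAILING | V_MAXTYPE)
#define V_STRICT            (V_TRAILING | V_MAXTYPE | V_UNSPEC | V_STRICT_ATTRS)

struct attr_sketch {
    int type;
    int len;
};

/* expected_len[t] is the policy for type t; 0 means unspecified. */
static bool validate_sketch(const struct attr_sketch *attrs, int n,
                            int trailing_bytes, const int *expected_len,
                            int max_type, unsigned int flags)
{
    int i;

    for (i = 0; i < n; i++) {
        const struct attr_sketch *a = &attrs[i];

        if (a->type < 0)
            return false;
        if (a->type > max_type) {
            if (flags & V_MAXTYPE)
                return false;
            continue; /* liberal: unknown types are skipped */
        }
        if (expected_len[a->type] == 0) {
            if (flags & V_UNSPEC)
                return false;
            continue;
        }
        if (a->len < expected_len[a->type])
            return false; /* too short is always an error */
        if ((flags & V_STRICT_ATTRS) && a->len != expected_len[a->type])
            return false;
    }
    if ((flags & V_TRAILING) && trailing_bytes > 0)
        return false;
    return true;
}
```

In this model the old "strict" level corresponds to `V_DEPRECATED_STRICT`, and new callers would pass `V_STRICT`.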
    • M
      netlink: make nla_nest_start() add NLA_F_NESTED flag · ae0be8de
      Committed by Michal Kubecek
      Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
      netlink based interfaces (including recently added ones) are still not
      setting it in kernel generated messages. Without the flag, message parsers
      not aware of attribute semantics (e.g. wireshark dissector or libmnl's
      mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
      the structure of their contents.
      
      Unfortunately we cannot just add the flag everywhere as there may be
      userspace applications which check nlattr::nla_type directly rather than
      through a helper masking out the flags. Therefore the patch renames
      nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
      as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
      are rewritten to use nla_nest_start().
      
      Except for changes in include/net/netlink.h, the patch was generated using
      this semantic patch:
      
      @@ expression E1, E2; @@
      -nla_nest_start(E1, E2)
      +nla_nest_start_noflag(E1, E2)
      
      @@ expression E1, E2; @@
      -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
      +nla_nest_start(E1, E2)
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
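The wrapper arrangement described in the nla_nest_start() commit above, and the reason userspace that compares nla_type directly can break, can be sketched as follows. The three macros mirror the values in the Linux uapi netlink header; the `*_sketch` functions are hypothetical and only compute the 16-bit type field instead of writing an attribute into a message buffer.

```c
#define NLA_F_NESTED        (1 << 15)
#define NLA_F_NET_BYTEORDER (1 << 14)
#define NLA_TYPE_MASK       (~(NLA_F_NESTED | NLA_F_NET_BYTEORDER))

/* Old behaviour: the type field is written without any flag. */
static unsigned short nest_start_noflag_sketch(int attrtype)
{
    return (unsigned short)attrtype;
}

/* New behaviour: a thin wrapper that sets NLA_F_NESTED. */
static unsigned short nest_start_sketch(int attrtype)
{
    return nest_start_noflag_sketch(attrtype | NLA_F_NESTED);
}

/* What a flag-aware parser does before comparing types. */
static int nla_type_sketch(unsigned short raw)
{
    return raw & NLA_TYPE_MASK;
}
```

A parser that masks the flags sees the same type either way, while one that compares the raw field against the bare type stops matching once the flag is set, which is why existing callers were moved to the noflag variant.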