1. 20 7月, 2021 1 次提交
    • V
      memcg: enable accounting for IP address and routing-related objects · 6126891c
      Vasily Averin 提交于
      An netadmin inside container can use 'ip a a' and 'ip r a'
      to assign a large number of ipv4/ipv6 addresses and routing entries
      and force kernel to allocate megabytes of unaccounted memory
      for long-lived per-netdevice related kernel objects:
      'struct in_ifaddr', 'struct inet6_ifaddr', 'struct fib6_node',
      'struct rt6_info', 'struct fib_rules' and ip_fib caches.
      
      These objects can be manually removed, though usually they lives
      in memory till destroy of its net namespace.
      
      It makes sense to account for them to restrict the host's memory
      consumption from inside the memcg-limited container.
      
      One of such objects is the 'struct fib6_node' mostly allocated in
      net/ipv6/route.c::__ip6_ins_rt() inside the lock_bh()/unlock_bh() section:
      
       write_lock_bh(&table->tb6_lock);
       err = fib6_add(&table->tb6_root, rt, info, mxc);
       write_unlock_bh(&table->tb6_lock);
      
      In this case it is not enough to simply add SLAB_ACCOUNT to corresponding
      kmem cache. The proper memory cgroup still cannot be found due to the
      incorrect 'in_interrupt()' check used in memcg_kmem_bypass().
      
      Obsoleted in_interrupt() does not describe real execution context properly.
      >From include/linux/preempt.h:
      
       The following macros are deprecated and should not be used in new code:
       in_interrupt()	- We're in NMI,IRQ,SoftIRQ context or have BH disabled
      
      To verify the current execution context new macro should be used instead:
       in_task()	- We're in task context
      Signed-off-by: NVasily Averin <vvs@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6126891c
  2. 18 7月, 2021 1 次提交
    • L
      igmp: Add ip_mc_list lock in ip_check_mc_rcu · 23d2b940
      Liu Jian 提交于
      I got below panic when doing fuzz test:
      
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 4056 Comm: syz-executor.3 Tainted: G    B             5.14.0-rc1-00195-gcff5c4254439-dirty #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
      Call Trace:
      dump_stack_lvl+0x7a/0x9b
      panic+0x2cd/0x5af
      end_report.cold+0x5a/0x5a
      kasan_report+0xec/0x110
      ip_check_mc_rcu+0x556/0x5d0
      __mkroute_output+0x895/0x1740
      ip_route_output_key_hash_rcu+0x2d0/0x1050
      ip_route_output_key_hash+0x182/0x2e0
      ip_route_output_flow+0x28/0x130
      udp_sendmsg+0x165d/0x2280
      udpv6_sendmsg+0x121e/0x24f0
      inet6_sendmsg+0xf7/0x140
      sock_sendmsg+0xe9/0x180
      ____sys_sendmsg+0x2b8/0x7a0
      ___sys_sendmsg+0xf0/0x160
      __sys_sendmmsg+0x17e/0x3c0
      __x64_sys_sendmmsg+0x9e/0x100
      do_syscall_64+0x3b/0x90
      entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x462eb9
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8
       48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48>
       3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f3df5af1c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462eb9
      RDX: 0000000000000312 RSI: 0000000020001700 RDI: 0000000000000007
      RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3df5af26bc
      R13: 00000000004c372d R14: 0000000000700b10 R15: 00000000ffffffff
      
      It is one use-after-free in ip_check_mc_rcu.
      In ip_mc_del_src, the ip_sf_list of pmc has been freed under pmc->lock protection.
      But access to ip_sf_list in ip_check_mc_rcu is not protected by the lock.
      Signed-off-by: NLiu Jian <liujian56@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      23d2b940
  3. 16 7月, 2021 1 次提交
  4. 14 7月, 2021 1 次提交
  5. 10 7月, 2021 3 次提交
  6. 09 7月, 2021 1 次提交
    • E
      ipv6: tcp: drop silly ICMPv6 packet too big messages · c7bb4b89
      Eric Dumazet 提交于
      While TCP stack scales reasonably well, there is still one part that
      can be used to DDOS it.
      
      IPv6 Packet too big messages have to lookup/insert a new route,
      and if abused by attackers, can easily put hosts under high stress,
      with many cpus contending on a spinlock while one is stuck in fib6_run_gc()
      
      ip6_protocol_deliver_rcu()
       icmpv6_rcv()
        icmpv6_notify()
         tcp_v6_err()
          tcp_v6_mtu_reduced()
           inet6_csk_update_pmtu()
            ip6_rt_update_pmtu()
             __ip6_rt_update_pmtu()
              ip6_rt_cache_alloc()
               ip6_dst_alloc()
                dst_alloc()
                 ip6_dst_gc()
                  fib6_run_gc()
                   spin_lock_bh() ...
      
      Some of our servers have been hit by malicious ICMPv6 packets
      trying to _increase_ the MTU/MSS of TCP flows.
      
      We believe these ICMPv6 packets are a result of a bug in one ISP stack,
      since they were blindly sent back for _every_ (small) packet sent to them.
      
      These packets are for one TCP flow:
      09:24:36.266491 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      09:24:36.266509 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      09:24:36.316688 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      09:24:36.316704 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      09:24:36.608151 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      
      TCP stack can filter some silly requests :
      
      1) MTU below IPV6_MIN_MTU can be filtered early in tcp_v6_err()
      2) tcp_v6_mtu_reduced() can drop requests trying to increase current MSS.
      
      This tests happen before the IPv6 routing stack is entered, thus
      removing the potential contention and route exhaustion.
      
      Note that IPv6 stack was performing these checks, but too late
      (ie : after the route has been added, and after the potential
      garbage collect war)
      
      v2: fix typo caught by Martin, thanks !
      v3: exports tcp_mtu_to_mss(), caught by David, thanks !
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reviewed-by: NMaciej Żenczykowski <maze@google.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7bb4b89
  7. 08 7月, 2021 1 次提交
  8. 07 7月, 2021 1 次提交
    • N
      tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized · be5d1b61
      Nguyen Dinh Phi 提交于
      This commit fixes a bug (found by syzkaller) that could cause spurious
      double-initializations for congestion control modules, which could cause
      memory leaks or other problems for congestion control modules (like CDG)
      that allocate memory in their init functions.
      
      The buggy scenario constructed by syzkaller was something like:
      
      (1) create a TCP socket
      (2) initiate a TFO connect via sendto()
      (3) while socket is in TCP_SYN_SENT, call setsockopt(TCP_CONGESTION),
          which calls:
             tcp_set_congestion_control() ->
               tcp_reinit_congestion_control() ->
                 tcp_init_congestion_control()
      (4) receive ACK, connection is established, call tcp_init_transfer(),
          set icsk_ca_initialized=0 (without first calling cc->release()),
          call tcp_init_congestion_control() again.
      
      Note that in this sequence tcp_init_congestion_control() is called
      twice without a cc->release() call in between. Thus, for CC modules
      that allocate memory in their init() function, e.g, CDG, a memory leak
      may occur. The syzkaller tool managed to find a reproducer that
      triggered such a leak in CDG.
      
      The bug was introduced when that commit 8919a9b3 ("tcp: Only init
      congestion control if not initialized already")
      introduced icsk_ca_initialized and set icsk_ca_initialized to 0 in
      tcp_init_transfer(), missing the possibility for a sequence like the
      one above, where a process could call setsockopt(TCP_CONGESTION) in
      state TCP_SYN_SENT (i.e. after the connect() or TFO open sendmsg()),
      which would call tcp_init_congestion_control(). It did not intend to
      reset any initialization that the user had already explicitly made;
      it just missed the possibility of that particular sequence (which
      syzkaller managed to find).
      
      Fixes: 8919a9b3 ("tcp: Only init congestion control if not initialized already")
      Reported-by: syzbot+f1e24a0594d4e3a895d3@syzkaller.appspotmail.com
      Signed-off-by: NNguyen Dinh Phi <phind.uet@gmail.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Tested-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be5d1b61
  9. 03 7月, 2021 2 次提交
  10. 02 7月, 2021 1 次提交
    • E
      udp: annotate data races around unix_sk(sk)->gso_size · 18a419ba
      Eric Dumazet 提交于
      Accesses to unix_sk(sk)->gso_size are lockless.
      Add READ_ONCE()/WRITE_ONCE() around them.
      
      BUG: KCSAN: data-race in udp_lib_setsockopt / udpv6_sendmsg
      
      write to 0xffff88812d78f47c of 2 bytes by task 10849 on cpu 1:
       udp_lib_setsockopt+0x3b3/0x710 net/ipv4/udp.c:2696
       udpv6_setsockopt+0x63/0x90 net/ipv6/udp.c:1630
       sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3265
       __sys_setsockopt+0x18f/0x200 net/socket.c:2104
       __do_sys_setsockopt net/socket.c:2115 [inline]
       __se_sys_setsockopt net/socket.c:2112 [inline]
       __x64_sys_setsockopt+0x62/0x70 net/socket.c:2112
       do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff88812d78f47c of 2 bytes by task 10852 on cpu 0:
       udpv6_sendmsg+0x161/0x16b0 net/ipv6/udp.c:1299
       inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:642
       sock_sendmsg_nosec net/socket.c:654 [inline]
       sock_sendmsg net/socket.c:674 [inline]
       ____sys_sendmsg+0x360/0x4d0 net/socket.c:2337
       ___sys_sendmsg net/socket.c:2391 [inline]
       __sys_sendmmsg+0x315/0x4b0 net/socket.c:2477
       __do_sys_sendmmsg net/socket.c:2506 [inline]
       __se_sys_sendmmsg net/socket.c:2503 [inline]
       __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2503
       do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0x0000 -> 0x0005
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 10852 Comm: syz-executor.0 Not tainted 5.13.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: bec1f6f6 ("udp: generate gso with UDP_SEGMENT")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      18a419ba
  11. 30 6月, 2021 3 次提交
  12. 29 6月, 2021 4 次提交
  13. 25 6月, 2021 1 次提交
    • J
      net: ip: avoid OOM kills with large UDP sends over loopback · 6d123b81
      Jakub Kicinski 提交于
      Dave observed number of machines hitting OOM on the UDP send
      path. The workload seems to be sending large UDP packets over
      loopback. Since loopback has MTU of 64k kernel will try to
      allocate an skb with up to 64k of head space. This has a good
      chance of failing under memory pressure. What's worse if
      the message length is <32k the allocation may trigger an
      OOM killer.
      
      This is entirely avoidable, we can use an skb with page frags.
      
      af_unix solves a similar problem by limiting the head
      length to SKB_MAX_ALLOC. This seems like a good and simple
      approach. It means that UDP messages > 16kB will now
      use fragments if underlying device supports SG, if extra
      allocator pressure causes regressions in real workloads
      we can switch to trying the large allocation first and
      falling back.
      
      v4: pre-calculate all the additions to alloclen so
          we can be sure it won't go over order-2
      Reported-by: NDave Jones <dsj@fb.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d123b81
  14. 24 6月, 2021 1 次提交
  15. 23 6月, 2021 1 次提交
  16. 21 6月, 2021 2 次提交
  17. 19 6月, 2021 1 次提交
    • T
      icmp: don't send out ICMP messages with a source address of 0.0.0.0 · 32182747
      Toke Høiland-Jørgensen 提交于
      When constructing ICMP response messages, the kernel will try to pick a
      suitable source address for the outgoing packet. However, if no IPv4
      addresses are configured on the system at all, this will fail and we end up
      producing an ICMP message with a source address of 0.0.0.0. This can happen
      on a box routing IPv4 traffic via v6 nexthops, for instance.
      
      Since 0.0.0.0 is not generally routable on the internet, there's a good
      chance that such ICMP messages will never make it back to the sender of the
      original packet that the ICMP message was sent in response to. This, in
      turn, can create connectivity and PMTUd problems for senders. Fortunately,
      RFC7600 reserves a dummy address to be used as a source for ICMP
      messages (192.0.0.8/32), so let's teach the kernel to substitute that
      address as a last resort if the regular source address selection procedure
      fails.
      
      Below is a quick example reproducing this issue with network namespaces:
      
      ip netns add ns0
      ip l add type veth peer netns ns0
      ip l set dev veth0 up
      ip a add 10.0.0.1/24 dev veth0
      ip a add fc00:dead:cafe:42::1/64 dev veth0
      ip r add 10.1.0.0/24 via inet6 fc00:dead:cafe:42::2
      ip -n ns0 l set dev veth0 up
      ip -n ns0 a add fc00:dead:cafe:42::2/64 dev veth0
      ip -n ns0 r add 10.0.0.0/24 via inet6 fc00:dead:cafe:42::1
      ip netns exec ns0 sysctl -w net.ipv4.icmp_ratelimit=0
      ip netns exec ns0 sysctl -w net.ipv4.ip_forward=1
      tcpdump -tpni veth0 -c 2 icmp &
      ping -w 1 10.1.0.1 > /dev/null
      tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
      listening on veth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
      IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 29, seq 1, length 64
      IP 0.0.0.0 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92
      2 packets captured
      2 packets received by filter
      0 packets dropped by kernel
      
      With this patch the above capture changes to:
      IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 31127, seq 1, length 64
      IP 192.0.0.8 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: NJuliusz Chroboczek <jch@irif.fr>
      Reviewed-by: NDavid Ahern <dsahern@kernel.org>
      Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32182747
  18. 17 6月, 2021 1 次提交
    • C
      net: ipv4: fix memory leak in ip_mc_add1_src · d8e29730
      Chengyang Fan 提交于
      BUG: memory leak
      unreferenced object 0xffff888101bc4c00 (size 32):
        comm "syz-executor527", pid 360, jiffies 4294807421 (age 19.329s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
          01 00 00 00 00 00 00 00 ac 14 14 bb 00 00 02 00 ................
        backtrace:
          [<00000000f17c5244>] kmalloc include/linux/slab.h:558 [inline]
          [<00000000f17c5244>] kzalloc include/linux/slab.h:688 [inline]
          [<00000000f17c5244>] ip_mc_add1_src net/ipv4/igmp.c:1971 [inline]
          [<00000000f17c5244>] ip_mc_add_src+0x95f/0xdb0 net/ipv4/igmp.c:2095
          [<000000001cb99709>] ip_mc_source+0x84c/0xea0 net/ipv4/igmp.c:2416
          [<0000000052cf19ed>] do_ip_setsockopt net/ipv4/ip_sockglue.c:1294 [inline]
          [<0000000052cf19ed>] ip_setsockopt+0x114b/0x30c0 net/ipv4/ip_sockglue.c:1423
          [<00000000477edfbc>] raw_setsockopt+0x13d/0x170 net/ipv4/raw.c:857
          [<00000000e75ca9bb>] __sys_setsockopt+0x158/0x270 net/socket.c:2117
          [<00000000bdb993a8>] __do_sys_setsockopt net/socket.c:2128 [inline]
          [<00000000bdb993a8>] __se_sys_setsockopt net/socket.c:2125 [inline]
          [<00000000bdb993a8>] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2125
          [<000000006a1ffdbd>] do_syscall_64+0x40/0x80 arch/x86/entry/common.c:47
          [<00000000b11467c4>] entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      In commit 24803f38 ("igmp: do not remove igmp souce list info when set
      link down"), the ip_mc_clear_src() in ip_mc_destroy_dev() was removed,
      because it was also called in igmpv3_clear_delrec().
      
      Rough callgraph:
      
      inetdev_destroy
      -> ip_mc_destroy_dev
           -> igmpv3_clear_delrec
              -> ip_mc_clear_src
      -> RCU_INIT_POINTER(dev->ip_ptr, NULL)
      
      However, ip_mc_clear_src() called in igmpv3_clear_delrec() doesn't
      release in_dev->mc_list->sources. And RCU_INIT_POINTER() assigns the
      NULL to dev->ip_ptr. As a result, in_dev cannot be obtained through
      inetdev_by_index() and then in_dev->mc_list->sources cannot be released
      by ip_mc_del1_src() in the sock_close. Rough call sequence goes like:
      
      sock_close
      -> __sock_release
         -> inet_release
            -> ip_mc_drop_socket
               -> inetdev_by_index
               -> ip_mc_leave_src
                  -> ip_mc_del_src
                     -> ip_mc_del1_src
      
      So we still need to call ip_mc_clear_src() in ip_mc_destroy_dev() to free
      in_dev->mc_list->sources.
      
      Fixes: 24803f38 ("igmp: do not remove igmp souce list info ...")
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: NChengyang Fan <cy.fan@huawei.com>
      Acked-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d8e29730
  19. 16 6月, 2021 6 次提交
  20. 15 6月, 2021 1 次提交
    • D
      ipv4: Fix device used for dst_alloc with local routes · b87b04f5
      David Ahern 提交于
      Oliver reported a use case where deleting a VRF device can hang
      waiting for the refcnt to drop to 0. The root cause is that the dst
      is allocated against the VRF device but cached on the loopback
      device.
      
      The use case (added to the selftests) has an implicit VRF crossing
      due to the ordering of the FIB rules (lookup local is before the
      l3mdev rule, but the problem occurs even if the FIB rules are
      re-ordered with local after l3mdev because the VRF table does not
      have a default route to terminate the lookup). The end result is
      is that the FIB lookup returns the loopback device as the nexthop,
      but the ingress device is in a VRF. The mismatch causes the dst
      alloc against the VRF device but then cached on the loopback.
      
      The fix is to bring the trick used for IPv6 (see ip6_rt_get_dev_rcu):
      pick the dst alloc device based the fib lookup result but with checks
      that the result has a nexthop device (e.g., not an unreachable or
      prohibit entry).
      
      Fixes: f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
      Reported-by: NOliver Herms <oliver.peter.herms@gmail.com>
      Signed-off-by: NDavid Ahern <dsahern@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b87b04f5
  21. 11 6月, 2021 1 次提交
  22. 10 6月, 2021 2 次提交
    • P
      udp: fix race between close() and udp_abort() · a8b897c7
      Paolo Abeni 提交于
      Kaustubh reported and diagnosed a panic in udp_lib_lookup().
      The root cause is udp_abort() racing with close(). Both
      racing functions acquire the socket lock, but udp{v6}_destroy_sock()
      release it before performing destructive actions.
      
      We can't easily extend the socket lock scope to avoid the race,
      instead use the SOCK_DEAD flag to prevent udp_abort from doing
      any action when the critical race happens.
      Diagnosed-and-tested-by: NKaustubh Pandey <kapandey@codeaurora.org>
      Fixes: 5d77dca8 ("net: diag: support SOCK_DESTROY for UDP sockets")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8b897c7
    • E
      inet: annotate data race in inet_send_prepare() and inet_dgram_connect() · dcd01eea
      Eric Dumazet 提交于
      Both functions are known to be racy when reading inet_num
      as we do not want to grab locks for the common case the socket
      has been bound already. The race is resolved in inet_autobind()
      by reading again inet_num under the socket lock.
      
      syzbot reported:
      BUG: KCSAN: data-race in inet_send_prepare / udp_lib_get_port
      
      write to 0xffff88812cba150e of 2 bytes by task 24135 on cpu 0:
       udp_lib_get_port+0x4b2/0xe20 net/ipv4/udp.c:308
       udp_v6_get_port+0x5e/0x70 net/ipv6/udp.c:89
       inet_autobind net/ipv4/af_inet.c:183 [inline]
       inet_send_prepare+0xd0/0x210 net/ipv4/af_inet.c:807
       inet6_sendmsg+0x29/0x80 net/ipv6/af_inet6.c:639
       sock_sendmsg_nosec net/socket.c:654 [inline]
       sock_sendmsg net/socket.c:674 [inline]
       ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
       ___sys_sendmsg net/socket.c:2404 [inline]
       __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490
       __do_sys_sendmmsg net/socket.c:2519 [inline]
       __se_sys_sendmmsg net/socket.c:2516 [inline]
       __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516
       do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff88812cba150e of 2 bytes by task 24132 on cpu 1:
       inet_send_prepare+0x21/0x210 net/ipv4/af_inet.c:806
       inet6_sendmsg+0x29/0x80 net/ipv6/af_inet6.c:639
       sock_sendmsg_nosec net/socket.c:654 [inline]
       sock_sendmsg net/socket.c:674 [inline]
       ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
       ___sys_sendmsg net/socket.c:2404 [inline]
       __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490
       __do_sys_sendmmsg net/socket.c:2519 [inline]
       __se_sys_sendmmsg net/socket.c:2516 [inline]
       __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516
       do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0x0000 -> 0x9db4
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 24132 Comm: syz-executor.2 Not tainted 5.13.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dcd01eea
  23. 09 6月, 2021 3 次提交
    • F
      xfrm: remove description from xfrm_type struct · 152bca09
      Florian Westphal 提交于
      Its set but never read. Reduces size of xfrm_type to 64 bytes on 64bit.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      152bca09
    • Z
      net: ipv4: Remove unneed BUG() function · 5ac6b198
      Zheng Yongjun 提交于
      When 'nla_parse_nested_deprecated' failed, it's no need to
      BUG() here, return -EINVAL is ok.
      Signed-off-by: NZheng Yongjun <zhengyongjun3@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ac6b198
    • N
      net: ipv4: fix memory leak in netlbl_cipsov4_add_std · d612c3f3
      Nanyong Sun 提交于
      Reported by syzkaller:
      BUG: memory leak
      unreferenced object 0xffff888105df7000 (size 64):
      comm "syz-executor842", pid 360, jiffies 4294824824 (age 22.546s)
      hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      backtrace:
      [<00000000e67ed558>] kmalloc include/linux/slab.h:590 [inline]
      [<00000000e67ed558>] kzalloc include/linux/slab.h:720 [inline]
      [<00000000e67ed558>] netlbl_cipsov4_add_std net/netlabel/netlabel_cipso_v4.c:145 [inline]
      [<00000000e67ed558>] netlbl_cipsov4_add+0x390/0x2340 net/netlabel/netlabel_cipso_v4.c:416
      [<0000000006040154>] genl_family_rcv_msg_doit.isra.0+0x20e/0x320 net/netlink/genetlink.c:739
      [<00000000204d7a1c>] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
      [<00000000204d7a1c>] genl_rcv_msg+0x2bf/0x4f0 net/netlink/genetlink.c:800
      [<00000000c0d6a995>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504
      [<00000000d78b9d2c>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
      [<000000009733081b>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
      [<000000009733081b>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340
      [<00000000d5fd43b8>] netlink_sendmsg+0x789/0xc70 net/netlink/af_netlink.c:1929
      [<000000000a2d1e40>] sock_sendmsg_nosec net/socket.c:654 [inline]
      [<000000000a2d1e40>] sock_sendmsg+0x139/0x170 net/socket.c:674
      [<00000000321d1969>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350
      [<00000000964e16bc>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404
      [<000000001615e288>] __sys_sendmsg+0xd3/0x190 net/socket.c:2433
      [<000000004ee8b6a5>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:47
      [<00000000171c7cee>] entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The memory of doi_def->map.std pointing is allocated in
      netlbl_cipsov4_add_std, but no place has freed it. It should be
      freed in cipso_v4_doi_free which frees the cipso DOI resource.
      
      Fixes: 96cb8e33 ("[NetLabel]: CIPSOv4 and Unlabeled packet integration")
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
      Acked-by: NPaul Moore <paul@paul-moore.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d612c3f3