  1. 12 Jan, 2021 2 commits
  2. 02 Dec, 2020 1 commit
  3. 18 Nov, 2020 1 commit
  4. 03 Nov, 2020 1 commit
    • net: add kcov handle to skb extensions · 6370cc3b
      Authored by Aleksandr Nogikh
      Remote KCOV coverage collection enables coverage-guided fuzzing of the
      code that is not reachable during normal system call execution. It is
      especially helpful for fuzzing networking subsystems, where it is
      common to perform packet handling in separate work queues even for the
      packets that originated directly from the user space.
      
      Enable coverage-guided frame injection by adding kcov remote handle to
      skb extensions. Default initialization in __alloc_skb and
      __build_skb_around ensures that no socket buffer that was generated
      during a system call will be missed.
      
      Code that is of interest and that performs packet processing should be
      annotated with kcov_remote_start()/kcov_remote_stop().
      
      An alternative approach is to determine kcov_handle solely on the
      basis of the device/interface that received the specific socket
      buffer. However, in this case it would be impossible to distinguish
      between packets that originated during normal background network
      processes or were intentionally injected from the user space.
      Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 04 Oct, 2020 1 commit
    • net/sched: act_vlan: Add {POP,PUSH}_ETH actions · 19fbcb36
      Authored by Guillaume Nault
      Implement TCA_VLAN_ACT_POP_ETH and TCA_VLAN_ACT_PUSH_ETH, to
      respectively pop and push a base Ethernet header at the beginning of a
      frame.
      
      POP_ETH is just a matter of pulling ETH_HLEN bytes. VLAN tags, if any,
      must be stripped before calling POP_ETH.
      
      PUSH_ETH is restricted to skbs with no mac_header, and only the MAC
      addresses can be configured. The Ethertype is automatically set from
      skb->protocol. These restrictions ensure that all of the skb's fields
      remain consistent, so that this action can't confuse other parts of
      the networking stack (like GSO).
      
      Since openvswitch already had these actions, consolidate the code in
      skbuff.c (like for vlan and mpls push/pop).
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
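As a rough userspace illustration of the two actions described in this commit, the sketch below models a packet as a flat buffer with headroom. The `struct pkt` type and the `pkt_eth_pop()` and `pkt_eth_push()` helpers are hypothetical names, not kernel APIs. The sketch does not model the mac_header restriction or VLAN handling; it only shows that popping pulls ETH_HLEN bytes and that pushing takes the Ethertype from the packet's protocol.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6   /* bytes in a MAC address */
#define ETH_HLEN 14  /* dst MAC + src MAC + 2-byte Ethertype */

/* Hypothetical stand-in for an skb: a flat buffer with headroom. */
struct pkt {
	uint8_t buf[128];
	size_t data;  /* offset of the first valid byte */
	size_t len;   /* number of valid bytes starting at data */
};

/* POP_ETH analogue: drop the base Ethernet header by pulling ETH_HLEN bytes. */
static int pkt_eth_pop(struct pkt *p)
{
	assert(p != NULL);
	if (p->len < ETH_HLEN)
		return -1;
	p->data += ETH_HLEN;
	p->len -= ETH_HLEN;
	return 0;
}

/* PUSH_ETH analogue: prepend a header. Only the MAC addresses are chosen by
 * the caller; the Ethertype is derived from the packet's protocol value
 * (passed in host byte order in this sketch). */
static int pkt_eth_push(struct pkt *p, const uint8_t *dst, const uint8_t *src,
			uint16_t protocol)
{
	uint8_t *hdr;

	assert(p != NULL && dst != NULL && src != NULL);
	if (p->data < ETH_HLEN)
		return -1;  /* not enough headroom */
	p->data -= ETH_HLEN;
	p->len += ETH_HLEN;
	hdr = p->buf + p->data;
	memcpy(hdr, dst, ETH_ALEN);
	memcpy(hdr + ETH_ALEN, src, ETH_ALEN);
	hdr[12] = (uint8_t)(protocol >> 8);    /* Ethertype, network byte order */
	hdr[13] = (uint8_t)(protocol & 0xff);
	return 0;
}
```

Pushing a header and then popping it leaves the payload offset and length where they started, which is the consistency property the commit message is concerned with.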
  6. 01 Oct, 2020 1 commit
    • bpf: Add redirect_neigh helper as redirect drop-in · b4ab3141
      Authored by Daniel Borkmann
      Add a redirect_neigh() helper as redirect() drop-in replacement
      for the xmit side. Main idea for the helper is to be very similar
      in semantics to the latter just that the skb gets injected into
      the neighboring subsystem in order to let the stack do the work
      it knows best anyway to populate the L2 addresses of the packet
      and then hand over to dev_queue_xmit() as redirect() does.
      
      This solves two bigger items: i) skbs don't need to go up to the
      stack on the host facing veth ingress side for traffic egressing
      the container to achieve the same for populating L2 which also
      has the huge advantage that ii) the skb->sk won't get orphaned in
      ip_rcv_core() when entering the IP routing layer on the host stack.
      
      Given that skb->sk neither gets orphaned when crossing the netns
      as per 9c4c3252 ("skbuff: preserve sock reference when scrubbing
      the skb.") the helper can then push the skbs directly to the phys
      device where FQ scheduler can do its work and TCP stack gets proper
      backpressure given we hold on to skb->sk as long as skb is still
      residing in queues.
      
      With the helper used in BPF data path to then push the skb to the
      phys device, I observed a stable/consistent TCP_STREAM improvement
      on veth devices for traffic going container -> host -> host ->
      container from ~10Gbps to ~15Gbps for a single stream in my test
      environment.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Cc: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/bpf/f207de81629e1724899b73b8112e0013be782d35.1601477936.git.daniel@iogearbox.net
  7. 10 Sep, 2020 1 commit
  8. 27 Aug, 2020 1 commit
  9. 25 Aug, 2020 1 commit
  10. 24 Aug, 2020 1 commit
  11. 21 Aug, 2020 1 commit
  12. 04 Aug, 2020 1 commit
    • net/sched: act_ct: fix miss set mru for ovs after defrag in act_ct · 038ebb1a
      Authored by wenxu
      When openvswitch conntrack is offloaded with the act_ct action,
      fragmented packets are defragmented in the ingress tc act_ct action
      and then miss the next chain. The packet is then passed to the
      openvswitch datapath without the mru, and the reassembled over-MTU
      packet is dropped by the output action in openvswitch:
      
      "kernel: net2: dropped over-mtu packet: 1528 > 1500"
      
      This patch adds mru to tc_skb_ext for the defrag-then-chain-miss
      case, and also adds mru to qdisc_skb_cb. act_ct stores the mru in
      qdisc_skb_cb when the packet is defragmented. When the chain misses,
      the mru is copied to tc_skb_ext, where the ovs datapath can read it.
      
      Fixes: b57dc7c1 ("net/sched: Introduce action ct")
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 25 Jul, 2020 1 commit
  14. 16 Jul, 2020 1 commit
  15. 30 Jun, 2020 1 commit
    • iov_iter: Move unnecessary inclusion of crypto/hash.h · 7999096f
      Authored by Herbert Xu
      The header file linux/uio.h includes crypto/hash.h which pulls in
      most of the Crypto API.  Since linux/uio.h is used throughout the
      kernel this means that every tiny bit of change to the Crypto API
      causes the entire kernel to get rebuilt.
      
      This patch fixes this by moving it into lib/iov_iter.c instead
      where it is actually used.
      
      This patch also fixes the ifdef to use CRYPTO_HASH instead of just
      CRYPTO which does not guarantee the existence of ahash.
      
      Unfortunately a number of drivers were relying on linux/uio.h to
      provide access to linux/slab.h.  This patch adds inclusions of
      linux/slab.h as detected by build failures.
      
      Also skbuff.h was relying on this to provide a declaration for
      ahash_request.  This patch adds a forward declaration instead.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  16. 03 Jun, 2020 1 commit
    • bpf: Fix up bpf_skb_adjust_room helper's skb csum setting · 836e66c2
      Authored by Daniel Borkmann
      Lorenz recently reported:
      
        In our TC classifier cls_redirect [0], we use the following sequence of
        helper calls to decapsulate a GUE (basically IP + UDP + custom header)
        encapsulated packet:
      
          bpf_skb_adjust_room(skb, -encap_len, BPF_ADJ_ROOM_MAC, BPF_F_ADJ_ROOM_FIXED_GSO)
          bpf_redirect(skb->ifindex, BPF_F_INGRESS)
      
        It seems like some checksums of the inner headers are not validated in
        this case. For example, a TCP SYN packet with invalid TCP checksum is
        still accepted by the network stack and elicits a SYN ACK. [...]
      
        That is, we receive the following packet from the driver:
      
          | ETH | IP | UDP | GUE | IP | TCP |
          skb->ip_summed == CHECKSUM_UNNECESSARY
      
        ip_summed is CHECKSUM_UNNECESSARY because our NICs do rx checksum offloading.
        On this packet we run skb_adjust_room_mac(-encap_len), and get the following:
      
          | ETH | IP | TCP |
          skb->ip_summed == CHECKSUM_UNNECESSARY
      
        Note that ip_summed is still CHECKSUM_UNNECESSARY. After bpf_redirect()'ing
        into the ingress, we end up in tcp_v4_rcv(). There, skb_checksum_init() is
        turned into a no-op due to CHECKSUM_UNNECESSARY.
      
      The bpf_skb_adjust_room() helper is not aware of protocol specifics. Internally,
      it handles the CHECKSUM_COMPLETE case via skb_postpull_rcsum(), but that does
      not cover CHECKSUM_UNNECESSARY. In this case skb->csum_level of the original
      skb prior to bpf_skb_adjust_room() call was 0, that is, covering UDP. Right now
      there is no way to adjust the skb->csum_level. NICs that have checksum offload
      disabled (CHECKSUM_NONE) or that support CHECKSUM_COMPLETE are not affected.
      
      Use a safe default for CHECKSUM_UNNECESSARY by resetting to CHECKSUM_NONE and
      add a flag to the helper called BPF_F_ADJ_ROOM_NO_CSUM_RESET that allows users
      to opt out. Opting out is useful for the case where we don't remove/add
      full protocol headers, or for the case where a user wants to adjust the csum
      level manually e.g. through the bpf_csum_level() helper that is added in a
      subsequent patch.
      
      The bpf_skb_proto_{4_to_6,6_to_4}() for NAT64/46 translation from the BPF
      bpf_skb_change_proto() helper uses bpf_skb_net_hdr_{push,pop}() pair internally
      as well but doesn't change layers, only transitions between v4 and v6 and vice
      versa, therefore no adaptation is required there.
      
        [0] https://lore.kernel.org/bpf/20200424185556.7358-1-lmb@cloudflare.com/
      
      Fixes: 2be7e212 ("bpf: add bpf_skb_adjust_room helper")
      Reported-by: Lorenz Bauer <lmb@cloudflare.com>
      Reported-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
      Link: https://lore.kernel.org/bpf/CACAyw9-uU_52esMd1JjuA80fRPHJv5vsSg8GnfW3t_qDU4aVKQ@mail.gmail.com/
      Link: https://lore.kernel.org/bpf/11a90472e7cce83e76ddbfce81fdfce7bfc68808.1591108731.git.daniel@iogearbox.net
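The default-reset rule from this commit can be pictured with a small userspace model. Everything prefixed `MODEL_` or `model_` below is a hypothetical stand-in rather than the kernel's definitions; in particular, the flag bit chosen here is arbitrary and only plays the role of BPF_F_ADJ_ROOM_NO_CSUM_RESET, and clearing the checksum level on reset is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model values only; the kernel defines the real constants. */
enum model_ip_summed {
	MODEL_CHECKSUM_NONE,
	MODEL_CHECKSUM_UNNECESSARY,
	MODEL_CHECKSUM_COMPLETE,
};

/* Stands in for BPF_F_ADJ_ROOM_NO_CSUM_RESET; the bit position is arbitrary. */
#define MODEL_F_NO_CSUM_RESET (1u << 0)

struct model_skb {
	enum model_ip_summed ip_summed;
	uint8_t csum_level;
};

/* Called after headers were added or removed. Unless the caller opted out,
 * stop trusting a hardware "checksum already verified" verdict, because it
 * may have covered a header that no longer exists. */
static void model_adjust_room_csum(struct model_skb *skb, uint32_t flags)
{
	assert(skb != NULL);
	if (flags & MODEL_F_NO_CSUM_RESET)
		return;
	if (skb->ip_summed == MODEL_CHECKSUM_UNNECESSARY) {
		skb->ip_summed = MODEL_CHECKSUM_NONE;
		skb->csum_level = 0;
	}
}
```

With the reset in place, a decapsulated packet is validated in software again; passing the opt-out flag keeps the previous behaviour for callers that manage the checksum level themselves.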
  17. 02 Jun, 2020 1 commit
  18. 18 May, 2020 1 commit
  19. 19 Apr, 2020 1 commit
    • skbuff.h: Replace zero-length array with flexible-array member · 5c91aa1d
      Authored by Gustavo A. R. Silva
      The current codebase makes use of the zero-length array language
      extension to the C90 standard, but the preferred mechanism to declare
      variable-length types such as these ones is a flexible array member[1][2],
      introduced in C99:
      
      struct foo {
              int stuff;
              struct boo array[];
      };
      
      By making use of the mechanism above, we will get a compiler warning
      in case the flexible array does not occur last in the structure, which
      will help us prevent some kind of undefined behavior bugs from being
      inadvertently introduced[3] to the codebase from now on.
      
      Also, notice that dynamic memory allocations won't be affected by
      this change:
      
      "Flexible array members have incomplete type, and so the sizeof operator
      may not be applied. As a quirk of the original implementation of
      zero-length arrays, sizeof evaluates to zero."[1]
      
      This issue was found with the help of Coccinelle.
      
      [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
      [2] https://github.com/KSPP/linux/issues/21
      [3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour")
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
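To make the allocation point concrete, here is a minimal userspace sketch of a structure ending in a flexible array member. The `struct sample` type and its helpers are hypothetical and unrelated to the kernel structures touched by the commit; the sketch only shows the usual pattern of sizing the allocation as the fixed part plus the trailing elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type; the flexible array member must be the last field. */
struct sample {
	size_t count;
	int values[];  /* C99 flexible array member, no element storage here */
};

/* Allocate the fixed part plus room for n trailing elements. */
static struct sample *sample_alloc(size_t n)
{
	struct sample *s = malloc(sizeof(*s) + n * sizeof(s->values[0]));

	if (!s)
		return NULL;
	s->count = n;
	for (size_t i = 0; i < n; i++)
		s->values[i] = (int)i;
	return s;
}

/* Sum the trailing elements. */
static int sample_sum(const struct sample *s)
{
	int sum = 0;

	assert(s != NULL);
	for (size_t i = 0; i < s->count; i++)
		sum += s->values[i];
	return sum;
}
```

Moving `values` above `count` in this struct should be rejected by the compiler, which is the diagnostic the commit message relies on.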
  20. 07 Apr, 2020 1 commit
  21. 30 Mar, 2020 1 commit
  22. 26 Mar, 2020 1 commit
    • net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build · 2c64605b
      Authored by Pablo Neira Ayuso
      net/netfilter/nft_fwd_netdev.c: In function ‘nft_fwd_netdev_eval’:
          net/netfilter/nft_fwd_netdev.c:32:10: error: ‘struct sk_buff’ has no member named ‘tc_redirected’
            pkt->skb->tc_redirected = 1;
                    ^~
          net/netfilter/nft_fwd_netdev.c:33:10: error: ‘struct sk_buff’ has no member named ‘tc_from_ingress’
            pkt->skb->tc_from_ingress = 1;
                    ^~
      
      To avoid a direct dependency with tc actions from netfilter, wrap the
      redirect bits around CONFIG_NET_REDIRECT and move helpers to
      include/linux/skbuff.h. Turn on this toggle from the ifb driver, the
      only existing client of these bits in the tree.
      
      This patch adds skb_set_redirected(), which sets the redirected bit
      on the skbuff, records whether the packet was redirected from ingress,
      and resets the timestamp (the timestamp reset was originally missing
      in the netfilter bugfix).
      
      Fixes: bcfabee1 ("netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress")
      Reported-by: noreply@ellerman.id.au
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 29 Feb, 2020 1 commit
  24. 17 Feb, 2020 1 commit
    • skbuff.h: fix all kernel-doc warnings · d2f273f0
      Authored by Randy Dunlap
      Fix all kernel-doc warnings in <linux/skbuff.h>.
      Fixes these warnings:
      
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'list' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'dev_scratch' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'ip_defrag_offset' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'skb_mstamp_ns' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member '__cloned_offset' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'head_frag' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member '__pkt_type_offset' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'encapsulation' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'encap_hdr_csum' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'csum_valid' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member '__pkt_vlan_present_offset' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'vlan_present' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'csum_complete_sw' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'csum_level' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'inner_protocol_type' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'remcsum_offload' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'sender_cpu' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'reserved_tailroom' not described in 'sk_buff'
      ../include/linux/skbuff.h:890: warning: Function parameter or member 'inner_ipproto' not described in 'sk_buff'
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 06 Feb, 2020 1 commit
    • skbuff: fix a data race in skb_queue_len() · 86b18aaa
      Authored by Qian Cai
      sk_buff_head.qlen can be accessed concurrently as noticed by KCSAN,
      
       BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_dgram_sendmsg
      
       read to 0xffff8a1b1d8a81c0 of 4 bytes by task 5371 on cpu 96:
        unix_dgram_sendmsg+0x9a9/0xb70 include/linux/skbuff.h:1821
      				 net/unix/af_unix.c:1761
        ____sys_sendmsg+0x33e/0x370
        ___sys_sendmsg+0xa6/0xf0
        __sys_sendmsg+0x69/0xf0
        __x64_sys_sendmsg+0x51/0x70
        do_syscall_64+0x91/0xb47
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
       write to 0xffff8a1b1d8a81c0 of 4 bytes by task 1 on cpu 99:
        __skb_try_recv_from_queue+0x327/0x410 include/linux/skbuff.h:2029
        __skb_try_recv_datagram+0xbe/0x220
        unix_dgram_recvmsg+0xee/0x850
        ____sys_recvmsg+0x1fb/0x210
        ___sys_recvmsg+0xa2/0xf0
        __sys_recvmsg+0x66/0xf0
        __x64_sys_recvmsg+0x51/0x70
        do_syscall_64+0x91/0xb47
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Since only the read is lockless, load tearing could introduce a logic
      bug in unix_recvq_full(). Fix it by adding lockless variants of
      skb_queue_len() and unix_recvq_full() that use READ_ONCE() on the
      read, with WRITE_ONCE() on the write, similar to commit d7d16a89
      ("net: add skb_queue_empty_lockless()").
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: David S. Miller <davem@davemloft.net>
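The pairing of a marked read with a marked write can be sketched in userspace as follows. The two macros are simplified approximations of the kernel's READ_ONCE() and WRITE_ONCE() built on the GCC/Clang `__typeof__` extension, and `struct queue_head` with its helpers is a hypothetical model, not the kernel's sk_buff_head API. The example is single-threaded and only shows where each access is marked.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace approximations: a volatile access asks the compiler
 * to perform the load or store once, without splitting or repeating it. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical queue head; only the length counter is modelled. */
struct queue_head {
	unsigned int qlen;
};

/* Writer side: assumed to run with the queue lock held. */
static void queue_len_inc(struct queue_head *q)
{
	assert(q != NULL);
	WRITE_ONCE(q->qlen, q->qlen + 1);
}

/* Reader side: may be called without the lock; the value can be stale,
 * but it is read in a single marked access. */
static unsigned int queue_len_lockless(const struct queue_head *q)
{
	assert(q != NULL);
	return READ_ONCE(q->qlen);
}
```

Keeping the lockless reader as a separate, clearly named helper follows the same reasoning the commit cites from skb_queue_empty_lockless(): call sites that skip the lock become easy to find.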
  26. 27 Jan, 2020 2 commits
  27. 15 Jan, 2020 1 commit
  28. 10 Jan, 2020 2 commits
  29. 09 Jan, 2020 1 commit
    • net: introduce skb_list_walk_safe for skb segment walking · dcfea72e
      Authored by Jason A. Donenfeld
      As part of the continual effort to remove direct usage of skb->next and
      skb->prev, this patch adds a helper for iterating through the
      singly-linked variant of skb lists, which are used for lists of GSO
      packets. The name "skb_list_..." has been chosen to match the existing
      function, "kfree_skb_list", which also operates on these singly-linked
      lists, and the "..._walk_safe" part is the same idiom as elsewhere in
      the kernel.
      
      This patch removes the helper from wireguard and puts it into
      linux/skbuff.h, while making it a bit more robust for general usage. In
      particular, parentheses are added around the macro argument usage, and it
      now accounts for trying to iterate through an already-null skb pointer,
      which will simply run the iteration zero times. This latter enhancement
      means it can be used to replace both do { ... } while and while (...)
      open-coded idioms.
      
      This should take care of these three possible usages, which match all
      current methods of iterations.
      
      skb_list_walk_safe(segs, skb, next) { ... }
      skb_list_walk_safe(skb, skb, next) { ... }
      skb_list_walk_safe(segs, skb, segs) { ... }
      
      Gcc appears to generate efficient code for each of these.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
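The properties listed in this commit (parenthesised arguments, zero iterations for a NULL start, and safety when the body consumes the current element) can be illustrated with a userspace analogue. `struct node`, `node_list_walk_safe()` and the two helper functions are hypothetical names written for this sketch; it is not a copy of the kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical singly linked node standing in for a GSO segment. */
struct node {
	struct node *next;
	int value;
};

/* The successor is saved before the body runs, so the body may free or
 * unlink the current node. A NULL first pointer yields zero iterations. */
#define node_list_walk_safe(first, pos, nxt)                          \
	for ((pos) = (first), (nxt) = (pos) ? (pos)->next : NULL;     \
	     (pos);                                                   \
	     (pos) = (nxt), (nxt) = (pos) ? (pos)->next : NULL)

/* Prepend a node; returns the new head, or the old head if malloc fails. */
static struct node *node_push(struct node *head, int value)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->next = head;
	n->value = value;
	return n;
}

/* Free every node while walking, returning the sum of the values. */
static int node_list_free_sum(struct node *head)
{
	struct node *pos, *nxt;
	int sum = 0;

	node_list_walk_safe(head, pos, nxt) {
		sum += pos->value;
		free(pos);  /* safe: nxt was read before the body ran */
	}
	assert(pos == NULL);  /* the walk always ends with a NULL cursor */
	return sum;
}
```

Because the NULL case is handled by the loop condition, the same macro can stand in for both the `do { ... } while` and `while (...)` open-coded forms mentioned above.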
  30. 09 Dec, 2019 1 commit
  31. 05 Dec, 2019 1 commit
    • net: Fixed updating of ethertype in skb_mpls_push() · d04ac224
      Authored by Martin Varghese
      The skb_mpls_push was not updating ethertype of an ethernet packet if
      the packet was originally received from a non ARPHRD_ETHER device.
      
      In the below OVS data path flow, since the device corresponding to
      port 7 is an l3 device (ARPHRD_NONE) the skb_mpls_push function does
      not update the ethertype of the packet even though the previous
      push_eth action had added an ethernet header to the packet.
      
      recirc_id(0),in_port(7),eth_type(0x0800),ipv4(tos=0/0xfc,ttl=64,frag=no),
      actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),
      push_mpls(label=13,tc=0,ttl=64,bos=1,eth_type=0x8847),4
      
      Fixes: 8822e270 ("net: core: move push MPLS functionality from OvS to core helper")
      Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  32. 03 Dec, 2019 1 commit
    • Fixed updating of ethertype in function skb_mpls_pop · 040b5cfb
      Authored by Martin Varghese
      The skb_mpls_pop was not updating ethertype of an ethernet packet if the
      packet was originally received from a non ARPHRD_ETHER device.
      
      In the below OVS data path flow, since the device corresponding to port 7
      is an l3 device (ARPHRD_NONE) the skb_mpls_pop function does not update
      the ethertype of the packet even though the previous push_eth action had
      added an ethernet header to the packet.
      
      recirc_id(0),in_port(7),eth_type(0x8847),
      mpls(label=12/0xfffff,tc=0/0,ttl=0/0x0,bos=1/1),
      actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),
      pop_mpls(eth_type=0x800),4
      
      Fixes: ed246cee ("net: core: move pop MPLS functionality from OvS to core helper")
      Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  33. 23 Nov, 2019 1 commit
  34. 15 Nov, 2019 1 commit
    • y2038: socket: use __kernel_old_timespec instead of timespec · df1b4ba9
      Authored by Arnd Bergmann
      The 'timespec' type definition and helpers like ktime_to_timespec()
      or timespec64_to_timespec() should no longer be used in the kernel so
      we can remove them and avoid introducing y2038 issues in new code.
      
      Change the socket code that needs to pass a timespec to user space for
      backward compatibility to use __kernel_old_timespec instead.  This type
      has the same layout but a more clearly defined name.
      
      Slightly reformat tcp_recv_timestamp() for consistency after the removal
      of timespec64_to_timespec().
      Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  35. 08 Nov, 2019 1 commit
    • net: add a READ_ONCE() in skb_peek_tail() · f8cc62ca
      Authored by Eric Dumazet
      skb_peek_tail() can be used without protection of a lock,
      as spotted by KCSAN [1]
      
      In order to avoid load-tearing, add a READ_ONCE().
      
      Note that the corresponding WRITE_ONCE() are already there.
      
      [1]
      BUG: KCSAN: data-race in sk_wait_data / skb_queue_tail
      
      read to 0xffff8880b36a4118 of 8 bytes by task 20426 on cpu 1:
       skb_peek_tail include/linux/skbuff.h:1784 [inline]
       sk_wait_data+0x15b/0x250 net/core/sock.c:2477
       kcm_wait_data+0x112/0x1f0 net/kcm/kcmsock.c:1103
       kcm_recvmsg+0xac/0x320 net/kcm/kcmsock.c:1130
       sock_recvmsg_nosec net/socket.c:871 [inline]
       sock_recvmsg net/socket.c:889 [inline]
       sock_recvmsg+0x92/0xb0 net/socket.c:885
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      write to 0xffff8880b36a4118 of 8 bytes by task 451 on cpu 0:
       __skb_insert include/linux/skbuff.h:1852 [inline]
       __skb_queue_before include/linux/skbuff.h:1958 [inline]
       __skb_queue_tail include/linux/skbuff.h:1991 [inline]
       skb_queue_tail+0x7e/0xc0 net/core/skbuff.c:3145
       kcm_queue_rcv_skb+0x202/0x310 net/kcm/kcmsock.c:206
       kcm_rcv_strparser+0x74/0x4b0 net/kcm/kcmsock.c:370
       __strp_recv+0x348/0xf50 net/strparser/strparser.c:309
       strp_recv+0x84/0xa0 net/strparser/strparser.c:343
       tcp_read_sock+0x174/0x5c0 net/ipv4/tcp.c:1639
       strp_read_sock+0xd4/0x140 net/strparser/strparser.c:366
       do_strp_work net/strparser/strparser.c:414 [inline]
       strp_work+0x9a/0xe0 net/strparser/strparser.c:423
       process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
       worker_thread+0xa0/0x800 kernel/workqueue.c:2415
       kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 451 Comm: kworker/u4:3 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: kstrp strp_work
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  36. 29 Oct, 2019 1 commit
    • net: add skb_queue_empty_lockless() · d7d16a89
      Authored by Eric Dumazet
      Some paths call skb_queue_empty() without holding
      the queue lock. We must use a barrier in order
      to not let the compiler do strange things, and avoid
      KCSAN splats.
      
      Adding a barrier in skb_queue_empty() might be overkill,
      I prefer adding a new helper to clearly identify
      points where the callers might be lockless. This might
      help us find real bugs.
      
      The corresponding WRITE_ONCE() should add zero cost
      for current compilers.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  37. 24 Oct, 2019 1 commit
    • net/flow_dissector: switch to siphash · 55667441
      Authored by Eric Dumazet
      UDP IPv6 packets auto flowlabels are using a 32bit secret
      (static u32 hashrnd in net/core/flow_dissector.c) and
      apply jhash() over fields known by the receivers.
      
      Attackers can easily infer the 32bit secret and use this information
      to identify a device and/or user, since this 32bit secret is only
      set at boot time.
      
      Really, using jhash() to generate cookies sent on the wire
      is a serious security concern.
      
      Trying to change the rol32(hash, 16) in ip6_make_flowlabel() would be
      a dead end. Trying to periodically change the secret (like in sch_sfq.c)
      could change paths taken in the network for long lived flows.
      
      Let's switch to siphash, as we did in commit df453700
      ("inet: switch IP ID generator to siphash")
      
      Using a cryptographically strong pseudo random function will solve this
      privacy issue and more generally remove other weak points in the stack.
      
      Packet schedulers using skb_get_hash_perturb() benefit from this change.
      
      Fixes: b5677416 ("ipv6: Enable auto flow labels by default")
      Fixes: 42240901 ("ipv6: Implement different admin modes for automatic flow labels")
      Fixes: 67800f9b ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
      Fixes: cb1ce2ef ("ipv6: Implement automatic flow label generation on transmit")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Jonathan Berger <jonathann1@walla.com>
      Reported-by: Amit Klein <aksecurity@gmail.com>
      Reported-by: Benny Pinkas <benny@pinkas.net>
      Cc: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>