1. 03 Dec, 2020 1 commit
  2. 02 Dec, 2020 1 commit
  3. 28 Nov, 2020 3 commits
  4. 27 Nov, 2020 1 commit
  5. 26 Nov, 2020 5 commits
  6. 25 Nov, 2020 2 commits
  7. 24 Nov, 2020 3 commits
    • net/packet: fix packet receive on L3 devices without visible hard header · d5496990
      Committed by Eyal Birger
      In the patchset merged by commit b9fcf0a0
      ("Merge branch 'support-AF_PACKET-for-layer-3-devices'") L3 devices which
      did not have header_ops were given one for the purpose of protocol parsing
      on af_packet transmit path.
      
      That change made af_packet receive path regard these devices as having a
      visible L3 header and therefore aligned incoming skb->data to point to the
      skb's mac_header. Some devices, such as ipip, xfrmi, and others, do not
      reset their mac_header prior to ingress and therefore their incoming
      packets became malformed.
      
      Ideally these devices would reset their mac headers, or af_packet would be
      able to rely on dev->hard_header_len being 0 for such cases, but it seems
      this is not the case.
      
      Fix by changing af_packet RX ll visibility criteria to include the
      existence of a '.create()' header operation, which is used when creating
      a device hard header - via dev_hard_header() - by upper layers, and does
      not exist in these L3 devices.
      
      As this predicate may be useful in other situations, add it as a common
      dev_has_header() helper in netdevice.h.
      
      Fixes: b9fcf0a0 ("Merge branch 'support-AF_PACKET-for-layer-3-devices'")
      Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
      Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20201121062817.3178900-1-eyal.birger@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
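The predicate described in the net/packet commit above can be sketched in user space as follows. The structures are simplified stand-ins for the kernel's `struct net_device` and `struct header_ops`, and the callbacks and example devices (`eth_create`, `l3_parse_protocol`, `eth_dev`, `l3_dev`, `bare_dev`) are hypothetical names used only for illustration; the real helper lives in netdevice.h and may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields needed for this illustration are modelled). */
struct header_ops {
    /* Builds a link-layer header; absent on L3 devices such as ipip. */
    int (*create)(void *skb, void *dev, unsigned short type);
    /* Protocol parsing hook; L3 devices may provide only this one. */
    unsigned short (*parse_protocol)(const void *skb);
};

struct net_device {
    const struct header_ops *header_ops;
};

/* True only when the device can build a hard header via .create(). */
static inline bool dev_has_header(const struct net_device *dev)
{
    return dev->header_ops && dev->header_ops->create;
}

/* Hypothetical callbacks, used only to populate the example devices. */
static int eth_create(void *skb, void *dev, unsigned short type)
{
    (void)skb; (void)dev; (void)type;
    return 14; /* length of an Ethernet header */
}

static unsigned short l3_parse_protocol(const void *skb)
{
    (void)skb;
    return 0x0800; /* IPv4 ethertype */
}

/* An Ethernet-like device: has both .create and .parse_protocol. */
static const struct header_ops eth_ops = {
    .create = eth_create,
    .parse_protocol = l3_parse_protocol,
};
/* An L3 device that was given header_ops only for protocol parsing. */
static const struct header_ops l3_ops = {
    .parse_protocol = l3_parse_protocol,
};

static const struct net_device eth_dev = { .header_ops = &eth_ops };
static const struct net_device l3_dev = { .header_ops = &l3_ops };
static const struct net_device bare_dev = { .header_ops = NULL };
```

With this predicate, a receive path that previously tested only for the presence of `header_ops` would treat `l3_dev` as having no visible link-layer header, which is the behavior the commit message describes.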
    • vsock/virtio: discard packets only when socket is really closed · 3fe356d5
      Committed by Stefano Garzarella
      Starting from commit 8692cefc ("virtio_vsock: Fix race condition
      in virtio_transport_recv_pkt"), we discard packets in
      virtio_transport_recv_pkt() if the socket has been released.
      
      When the socket is connected, we schedule a delayed work to wait for the
      RST packet from the other peer, even if SHUTDOWN_MASK is set in
      sk->sk_shutdown.
      This is done to complete the virtio-vsock shutdown algorithm, releasing
      the port assigned to the socket definitively only when the other peer
      has consumed all the packets.
      
      If we discard the RST packet received, the socket will be closed only
      when the VSOCK_CLOSE_TIMEOUT is reached.
      
      Sergio discovered the issue while running ab(1) HTTP benchmark using
      libkrun [1] and observing a latency increase with that commit.
      
      To avoid this issue, we discard packets only if the socket is really
      closed (SOCK_DONE flag is set).
      We also set SOCK_DONE in virtio_transport_release() when we don't need
      to wait for any packets from the other peer (we didn't schedule the delayed
      work). In this case we remove the socket from the vsock lists, releasing
      the port assigned.
      
      [1] https://github.com/containers/libkrun
      
      Fixes: 8692cefc ("virtio_vsock: Fix race condition in virtio_transport_recv_pkt")
      Cc: justin.he@arm.com
      Reported-by: Sergio Lopez <slp@redhat.com>
      Tested-by: Sergio Lopez <slp@redhat.com>
      Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
      Acked-by: Jia He <justin.he@arm.com>
      Link: https://lore.kernel.org/r/20201120104736.73749-1-sgarzare@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
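The change in discard criteria described in the vsock/virtio commit above can be illustrated with a small user-space model. Everything here is a sketch: `struct mock_sock`, `discard_before_fix()`, `discard_after_fix()` and `mock_release()` are hypothetical names, and the model assumes the earlier check keyed on `sk_shutdown` being equal to SHUTDOWN_MASK, which the commit message implies but does not spell out.

```c
#include <stdbool.h>

#define RCV_SHUTDOWN  1
#define SEND_SHUTDOWN 2
#define SHUTDOWN_MASK 3

/* Hypothetical stand-in for the socket state relevant to this fix. */
struct mock_sock {
    unsigned int shutdown;      /* models sk->sk_shutdown */
    bool done;                  /* models the SOCK_DONE flag */
    bool close_work_scheduled;  /* models the delayed close work */
};

/* Assumed pre-fix criterion: discard once both directions are shut down. */
static bool discard_before_fix(const struct mock_sock *sk)
{
    return sk->shutdown == SHUTDOWN_MASK;
}

/* Post-fix criterion from the commit message: discard only when the
 * socket is really closed. */
static bool discard_after_fix(const struct mock_sock *sk)
{
    return sk->done;
}

/* Models release: a connected socket waits for the peer's RST, so it is
 * not marked done yet; otherwise it is marked done immediately. */
static void mock_release(struct mock_sock *sk, bool connected)
{
    sk->shutdown = SHUTDOWN_MASK;
    if (connected)
        sk->close_work_scheduled = true;
    else
        sk->done = true;
}
```

In this model a connected socket that has been released still accepts the peer's RST under the new criterion, so it does not have to wait for the close timeout.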
    • tcp: fix race condition when creating child sockets from syncookies · 01770a16
      Committed by Ricardo Dias
      When the TCP stack is in SYN flood mode, the server child socket is
      created from the SYN cookie received in a TCP packet with the ACK flag
      set.
      
      The child socket is created when the server receives the first TCP
      packet with a valid SYN cookie from the client. Usually, this packet
      corresponds to the final step of the TCP 3-way handshake, the ACK
      packet. But it is also possible to receive a valid SYN cookie from the
      first TCP data packet sent by the client, and thus create a child socket
      from that SYN cookie.
      
      Since a client socket is ready to send data as soon as it receives the
      SYN+ACK packet from the server, the client can send the ACK packet (sent
      by the TCP stack code), and the first data packet (sent by the userspace
      program) almost at the same time, and thus the server will equally
      receive the two TCP packets with valid SYN cookies almost at the same
      instant.
      
      When such an event happens, the TCP stack code has a race condition that
      occurs between the moment a lookup is done in the established
      connections hashtable to check for the existence of a connection for the
      same client, and the moment that the child socket is added to the
      established connections hashtable. As a consequence, this race condition
      can lead to a situation where we add two child sockets to the
      established connections hashtable and deliver two sockets for the
      same client to the userspace program.
      
      This patch fixes the race condition by checking if an existing child
      socket exists for the same client when we are adding the second child
      socket to the established connections hashtable. If an existing child
      socket exists, we drop the packet and discard the second child socket
      for the same client.
      Signed-off-by: Ricardo Dias <rdias@singlestore.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20201120111133.GA67501@rdias-suse-pc.lan
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
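The idea behind the syncookie fix above, checking for a duplicate at the moment of insertion rather than in a separate earlier lookup, can be sketched as a check-and-insert on a small hash table. This is a single-threaded illustration under stated assumptions: `struct conn_key`, `struct child_sock`, `buckets` and `ehash_insert_nodup()` are hypothetical names, and the comments mark where a per-bucket lock would be held in a concurrent implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical connection key: just the client address and port. */
struct conn_key {
    unsigned int saddr;
    unsigned short sport;
};

struct child_sock {
    struct conn_key key;
    struct child_sock *next;
};

static struct child_sock *buckets[TABLE_SIZE];

static bool key_equal(const struct conn_key *a, const struct conn_key *b)
{
    return a->saddr == b->saddr && a->sport == b->sport;
}

static unsigned int hash_key(const struct conn_key *k)
{
    return (k->saddr ^ k->sport) % TABLE_SIZE;
}

/* Insert sk unless a socket with the same key is already present.
 * The duplicate check and the insertion happen in one critical section,
 * so two racing inserts for the same client cannot both succeed. */
static bool ehash_insert_nodup(struct child_sock *sk,
                               struct child_sock **found_dup)
{
    unsigned int slot = hash_key(&sk->key);
    struct child_sock *cur;

    *found_dup = NULL;
    /* a per-bucket lock would be taken here */
    for (cur = buckets[slot]; cur; cur = cur->next) {
        if (key_equal(&cur->key, &sk->key)) {
            *found_dup = cur;
            /* the lock would be released here */
            return false;
        }
    }
    sk->next = buckets[slot];
    buckets[slot] = sk;
    /* the lock would be released here */
    return true;
}
```

A caller that gets `false` back would drop the packet and discard its freshly created socket, leaving the socket reported through `found_dup` as the only one for that client.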
  8. 22 Nov, 2020 1 commit
  9. 21 Nov, 2020 4 commits
    • tcp: Set INET_ECN_xmit configuration in tcp_reinit_congestion_control · 55472017
      Committed by Alexander Duyck
      When setting congestion control via a BPF program, the SYN/ACK
      packets within a given flow do not include the ECT0 flag. Some
      simple printk debugging shows that when this is configured without
      BPF, the INET_ECN_xmit value is initialized in
      tcp_assign_congestion_control. When it is configured via BPF, however, the
      socket is in the closed state and as such it isn't configured, and I do not
      see it being initialized when we transition the socket into the listen
      state. The result is that the ECT0 bit is configured based on
      whatever the default state is for the socket.
      
      An easy way to reproduce this is to monitor the following with tcpdump:
      tools/testing/selftests/bpf/test_progs -t bpf_tcp_ca
      
      Without this patch the SYN/ACK will follow whatever the default is. If the
      default is dctcp, all SYN/ACK packets will have the ECT0 bit set, and if it
      is not, then ECT0 will be cleared on all SYN/ACK packets. With this patch
      applied the SYN/ACK bit matches the value seen on the other packets in the
      given stream.
      
      Fixes: 91b5b21c ("bpf: Add support for changing congestion control")
      Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
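The behavior the commit above aims for, having the socket's ECN transmit setting follow the newly selected congestion control, can be sketched as below. This is a user-space model, not the kernel code: `struct mock_sock`, `struct mock_ca_ops`, `reinit_congestion_control()` and the two example algorithms are hypothetical, and the constant values are assumptions chosen to mirror the kernel's ECN field layout (two low bits of the TOS byte).

```c
#define INET_ECN_ECT_0     2     /* ECT(0) codepoint in the two ECN bits */
#define INET_ECN_MASK      3     /* both ECN bits of the TOS byte */
#define TCP_CONG_NEEDS_ECN 0x2   /* assumed flag value, for illustration */

/* Hypothetical stand-ins for congestion control ops and socket state. */
struct mock_ca_ops {
    unsigned int flags;
};

struct mock_sock {
    unsigned char tos;              /* models the socket's TOS byte */
    const struct mock_ca_ops *ca;   /* currently selected algorithm */
};

/* Mark outgoing packets as ECN-capable. */
static void ecn_xmit(struct mock_sock *sk)
{
    sk->tos |= INET_ECN_ECT_0;
}

/* Clear the ECN bits on outgoing packets. */
static void ecn_dontxmit(struct mock_sock *sk)
{
    sk->tos &= (unsigned char)~INET_ECN_MASK;
}

/* Switch algorithms and bring the ECN setting in line with the new one,
 * regardless of which state the socket is currently in. */
static void reinit_congestion_control(struct mock_sock *sk,
                                      const struct mock_ca_ops *ca)
{
    sk->ca = ca;
    if (ca->flags & TCP_CONG_NEEDS_ECN)
        ecn_xmit(sk);
    else
        ecn_dontxmit(sk);
}

/* Example algorithms: one that needs ECN (dctcp-like), one that does not. */
static const struct mock_ca_ops dctcp_like = { .flags = TCP_CONG_NEEDS_ECN };
static const struct mock_ca_ops cubic_like = { .flags = 0 };
```

Switching a socket from `cubic_like` to `dctcp_like` in this model sets ECT(0) on the TOS byte, and switching back clears it, so the SYN/ACK marking no longer depends on the system default.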
    • tcp: Allow full IP tos/IPv6 tclass to be reflected in L3 header · 861602b5
      Committed by Alexander Duyck
      An issue was recently found where DCTCP SYN/ACK packets did not have the
      ECT bit set in the L3 header. A bit of code review found that the recent
      change referenced below had gone through and added a mask that prevented the
      ECN bits from being populated in the L3 header.
      
      This patch addresses that by rolling back the mask so that it is only
      applied to the flags coming from the incoming TCP request instead of
      applying it to the socket tos/tclass field. With this change, the ECT bits
      were restored in the SYN/ACK packets in my testing.
      
      One thing that is not addressed by this patch set is the fact that
      tcp_reflect_tos appears to be incompatible with ECN based congestion
      avoidance algorithms. At a minimum the feature should likely be documented,
      which it currently isn't.
      
      Fixes: ac8f1710 ("tcp: reflect tos value received in SYN to the socket")
      Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
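The masking described in the tos/tclass commit above can be illustrated with a small function that composes the TOS byte for a SYN/ACK. It is a sketch under assumptions: `synack_tos()` and its parameters are hypothetical names, and the exact expression used in the kernel may differ. The point is only that the ECN mask is applied to the value taken from the incoming SYN, while the ECN bits come from the listening socket.

```c
#include <stdbool.h>

#define INET_ECN_MASK 3   /* the two ECN bits of the TOS / tclass byte */

/* Compose the TOS byte for a SYN/ACK.
 *   reflect_tos: models the tcp_reflect_tos setting
 *   syn_tos:     TOS byte seen on the incoming SYN
 *   sk_tos:      TOS byte configured on the listening socket
 * When reflecting, take the DSCP bits from the SYN and the ECN bits
 * from the socket; otherwise use the socket value unchanged. */
static unsigned char synack_tos(bool reflect_tos,
                                unsigned char syn_tos,
                                unsigned char sk_tos)
{
    if (reflect_tos)
        return (unsigned char)((syn_tos & ~INET_ECN_MASK) |
                               (sk_tos & INET_ECN_MASK));
    return sk_tos;
}
```

For example, with reflection enabled, a SYN carrying TOS 0xb9 and a socket whose TOS has ECT(0) set (0x02) produce a SYN/ACK TOS that keeps the SYN's DSCP bits and the socket's ECN bits.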
    • net/tls: missing received data after fast remote close · 20ffc7ad
      Committed by Vadim Fedorenko
      When a TCP socket receives a FIN after some data and the
      parser has not started before the data is read, the caller receives
      an empty buffer. This behavior differs from a plain TCP socket and
      requires special handling in user space.
      The flow that triggers the race is simple. The server sends a small
      amount of data right after the connection is configured to use TLS
      and closes the connection. In this case the receiver sees the TLS Handshake
      data and configures the TLS socket right after the Change Cipher Spec record.
      While the configuration is in progress, the TCP socket receives a small
      Application Data record, an Encrypted Alert record, and a FIN packet. So
      the TCP socket changes sk_shutdown to RCV_SHUTDOWN and sets the SK_DONE
      bit in sk_flag. The received data is not parsed upon arrival and is
      never sent to user space.
      
      This patch unpauses the parser directly if there is unparsed data in the
      TCP receive queue.
      
      Fixes: fcf4793e ("tls: check RCV_SHUTDOWN in tls_wait_data")
      Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
      Link: https://lore.kernel.org/r/1605801588-12236-1-git-send-email-vfedorenko@novek.ru
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • rose: Fix Null pointer dereference in rose_send_frame() · 3b3fd068
      Committed by Anmol Karn
      rose_send_frame() dereferences `neigh->dev` when called from
      rose_transmit_clear_request(). The `neigh` in question first appears
      in rose_loopback_timer() as `rose_loopback_neigh`, whose `->dev` is
      initialized to NULL in rose_add_loopback_neigh().
      That is, when `rose_loopback_neigh` was used in rose_loopback_timer(),
      its `->dev` was still NULL, and rose_loopback_timer() was calling
      rose_rx_call_request() without checking for NULL.
      
      - net/rose/rose_link.c
      This bug seems to get triggered in this line:
      
      rose_call = (ax25_address *)neigh->dev->dev_addr;
      
      Fix it by adding NULL checking for `rose_loopback_neigh->dev`
      in rose_loopback_timer().
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Reported-by: syzbot+a1c743815982d9496393@syzkaller.appspotmail.com
      Tested-by: syzbot+a1c743815982d9496393@syzkaller.appspotmail.com
      Link: https://syzkaller.appspot.com/bug?id=9d2a7ca8c7f2e4b682c97578dfa3f236258300b3
      Signed-off-by: Anmol Karn <anmol.karan123@gmail.com>
      Link: https://lore.kernel.org/r/20201119191043.28813-1-anmol.karan123@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
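The guard described in the rose commit above amounts to checking the neighbour's device pointer before reading its address. A minimal user-space sketch follows; `struct mock_dev`, `struct mock_neigh` and `loopback_call_address()` are hypothetical names, and the real fix sits inside rose_loopback_timer() rather than in a separate helper.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the device and neighbour structures. */
struct mock_dev {
    unsigned char dev_addr[7];   /* models dev->dev_addr */
};

struct mock_neigh {
    struct mock_dev *dev;        /* NULL until a device is attached */
};

/* Return the device address to use for a call request, or NULL when the
 * neighbour has no device yet, so the caller can skip the frame instead
 * of dereferencing a NULL pointer. */
static const unsigned char *loopback_call_address(const struct mock_neigh *neigh)
{
    if (!neigh || !neigh->dev)
        return NULL;
    return neigh->dev->dev_addr;
}
```

A caller would treat a NULL return as "no device attached" and drop the pending frame rather than continue to the address lookup.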
  10. 20 Nov, 2020 3 commits
  11. 19 Nov, 2020 2 commits
  12. 18 Nov, 2020 10 commits
  13. 17 Nov, 2020 3 commits
  14. 16 Nov, 2020 1 commit