1. 19 Jul, 2017 3 commits
  2. 18 Jul, 2017 3 commits
  3. 17 Jul, 2017 1 commit
    • E
      inetpeer: remove AVL implementation in favor of RB tree · b145425f
      Eric Dumazet committed
      As discussed in Faro during Netfilter Workshop 2017, RB trees can be
      used with RCU, using a seqlock.
      
      Note that net/rxrpc/conn_service.c is already using this.
      
      This patch converts inetpeer from an AVL tree to an RB tree, which allows
      removing the private AVL implementation in favor of shared RB code.
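
The read-side retry idea behind "RB trees with RCU, using a seqlock" can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: every `demo_*` name is hypothetical, a plain binary search tree stands in for the rb-tree, and the pointer reads are not RCU-safe (real code would also need RCU-protected pointer loads and deferred freeing, and the writer would hold a lock).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's seqlock_t. In the kernel the
 * writer side also takes a spinlock; that part is omitted here. */
struct demo_seqcount {
    atomic_uint seq;    /* even: stable, odd: writer in progress */
};

static void demo_write_begin(struct demo_seqcount *s)
{
    atomic_fetch_add(&s->seq, 1);   /* sequence becomes odd */
}

static void demo_write_end(struct demo_seqcount *s)
{
    atomic_fetch_add(&s->seq, 1);   /* sequence becomes even again */
}

static unsigned demo_read_begin(struct demo_seqcount *s)
{
    unsigned v;

    /* Spin while a writer is mid-update (odd sequence value). */
    while ((v = atomic_load(&s->seq)) & 1u)
        ;
    return v;
}

static bool demo_read_retry(struct demo_seqcount *s, unsigned start)
{
    /* True if any writer ran since demo_read_begin() returned start. */
    return atomic_load(&s->seq) != start;
}

/* Plain binary search tree node standing in for an rb-tree node. */
struct demo_node {
    int key;
    struct demo_node *left, *right;
};

static struct demo_node *demo_find(struct demo_node *n, int key)
{
    while (n && n->key != key)
        n = key < n->key ? n->left : n->right;
    return n;
}

/* Lockless lookup: a hit is trusted, while a miss is trusted only if no
 * writer changed the tree during the walk. */
static struct demo_node *demo_lookup(struct demo_seqcount *s,
                                     struct demo_node *const *root, int key)
{
    struct demo_node *n;
    unsigned start;

    do {
        start = demo_read_begin(s);
        n = demo_find(*root, key);
    } while (!n && demo_read_retry(s, start));
    return n;
}
```

The design point is that a rebalancing writer can make a concurrent reader miss a node that is present, so only a miss needs to be rechecked against the sequence counter.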
      
      $ size net/ipv4/inetpeer.before net/ipv4/inetpeer.after
         text    data     bss     dec     hex filename
         3195      40     128    3363     d23 net/ipv4/inetpeer.before
         1562      24       0    1586     632 net/ipv4/inetpeer.after
      
      The same technique can be used to speed up
      net/netfilter/nft_set_rbtree.c (removing rwlock contention in fast path)
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 12 Jul, 2017 1 commit
  5. 06 Jul, 2017 1 commit
  6. 05 Jul, 2017 1 commit
  7. 04 Jul, 2017 4 commits
  8. 03 Jul, 2017 1 commit
  9. 02 Jul, 2017 4 commits
  10. 01 Jul, 2017 7 commits
  11. 30 Jun, 2017 2 commits
  12. 28 Jun, 2017 2 commits
  13. 27 Jun, 2017 3 commits
  14. 26 Jun, 2017 1 commit
  15. 25 Jun, 2017 1 commit
    • J
      net: store port/representator id in metadata_dst · 3fcece12
      Jakub Kicinski committed
      Switches and modern SR-IOV enabled NICs may multiplex traffic from port
      representators and control messages over a single set of hardware queues.
      Control messages and muxed traffic may need ordered delivery.
      
      Those requirements make it hard to comfortably use TC infrastructure today
      unless we have a way of attaching metadata to skbs at the upper device.
      Because a single set of queues is used for many netdevs, stopping the
      TC/sched queues of all of them reliably is impossible, and the lower device
      has to retreat to returning NETDEV_TX_BUSY and usually has to take extra
      locks on the fastpath.
      
      This patch attempts to enable port/representative devs to attach metadata
      to skbs which carry port id.  This way representatives can be queueless and
      all queuing can be performed at the lower netdev in the usual way.
      
      Traffic arriving on the port/representative interfaces will have
      metadata attached and will subsequently be queued to the lower device for
      transmission.  The lower device should recognize the metadata and translate
      it to HW specific format which is most likely either a special header
      inserted before the network headers or descriptor/metadata fields.
      
      Metadata is associated with the lower device by storing the netdev pointer
      along with port id so that if TC decides to redirect or mirror the new
      netdev will not try to interpret it.
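
The ownership check described above can be sketched as follows. This is a userspace illustration only: the `demo_*` types and the function name are hypothetical and do not reproduce the kernel's structures. It assumes the metadata carries a pointer to the lower device together with the port id, as the commit message describes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a netdev, the port metadata, and an skb. */
struct demo_netdev {
    const char *name;
};

struct demo_port_meta {
    const struct demo_netdev *lower_dev;  /* device that understands port_id */
    uint32_t port_id;
};

struct demo_skb {
    const struct demo_port_meta *meta;    /* NULL when no metadata attached */
};

/* Called by a lower device on transmit. The port id is only interpreted
 * when the metadata was attached for this very device, so a packet that
 * TC redirected or mirrored to another netdev is not misread. */
static bool demo_lower_dev_port_id(const struct demo_netdev *self,
                                   const struct demo_skb *skb,
                                   uint32_t *port_id)
{
    const struct demo_port_meta *m = skb->meta;

    if (!m || m->lower_dev != self)
        return false;
    *port_id = m->port_id;
    return true;
}
```

A device that gets `false` back would treat the packet as ordinary traffic instead of translating a port id into a hardware-specific header or descriptor field.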
      
      This is mostly for SR-IOV devices since switches don't have lower netdevs
      today.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 24 Jun, 2017 2 commits
    • J
      tcp: fix out-of-bounds access in ULP sysctl · 926f38e9
      Jakub Kicinski committed
      KASAN reports out-of-bound access in proc_dostring() coming from
      proc_tcp_available_ulp() because in case TCP ULP list is empty
      the buffer allocated for the response will not have anything
      printed into it.  Set the first byte to zero to avoid strlen()
      going out-of-bounds.
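
A minimal userspace sketch of this class of fix, assuming a helper that joins names into a caller-provided buffer. The function name and signature are hypothetical and are not the kernel's. The point is the early `buf[0] = '\0'`, which keeps a later `strlen()` in bounds even when nothing gets printed.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join names[0..count) with single spaces into buf. */
static void demo_build_available_list(char *buf, size_t maxlen,
                                      const char *const *names, size_t count)
{
    size_t offs = 0;

    if (maxlen == 0)
        return;

    /* The fix: terminate the buffer up front so that an empty list still
     * leaves a valid, zero-length string behind. */
    buf[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        int n = snprintf(buf + offs, maxlen - offs, "%s%s",
                         offs == 0 ? "" : " ", names[i]);

        /* Stop on error or truncation; snprintf() keeps buf terminated. */
        if (n < 0 || (size_t)n >= maxlen - offs)
            break;
        offs += (size_t)n;
    }
}
```

Without the early terminator, a buffer that still holds stale bytes would be handed to `strlen()` unchanged whenever `count` is zero.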
      
      Fixes: 734942cc ("tcp: ULP infrastructure")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net: account for current skb length when deciding about UFO · a5cb659b
      Michal Kubeček committed
      Our customer encountered stuck NFS writes for blocks starting at specific
      offsets w.r.t. page boundary caused by networking stack sending packets via
      UFO enabled device with wrong checksum. The problem can be reproduced by
      composing a long UDP datagram from multiple parts using MSG_MORE flag:
      
        sendto(sd, buff, 1000, MSG_MORE, ...);
        sendto(sd, buff, 1000, MSG_MORE, ...);
        sendto(sd, buff, 3000, 0, ...);
      
      Assume this packet is to be routed via a device with MTU 1500 and
      NETIF_F_UFO enabled. When the second sendto() gets into __ip_append_data(),
      this condition is tested (among others) to decide whether to call
      ip_ufo_append_data():
      
        ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))
      
      At the moment, we already have skb with 1028 bytes of data which is not
      marked for GSO so that the test is false (fragheaderlen is usually 20).
      Thus we append second 1000 bytes to this skb without invoking UFO. Third
      sendto(), however, has sufficient length to trigger the UFO path so that we
      end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
      uses udp_csum() to calculate the checksum but that assumes all fragments
      have correct checksum in skb->csum which is not true for UFO fragments.
      
      When checking against MTU, we need to add skb->len to length of new segment
      if we already have a partially filled skb and fragheaderlen only if there
      isn't one.
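
The difference between the two tests can be replayed in userspace with the numbers from the example above (MTU 1500, fragheaderlen 20, a pending skb holding 1028 bytes). This is a sketch only: `struct demo_skb` and both function names are hypothetical, and the other conditions that the kernel checks alongside this one are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two skb properties the test looks at. */
struct demo_skb {
    unsigned int len;   /* bytes already queued in the pending skb */
    bool is_gso;        /* whether the pending skb is marked for GSO */
};

/* Test as quoted in the commit message: only the new chunk is compared
 * against the MTU, so data already queued in skb is ignored. */
static bool demo_wants_ufo_old(unsigned int length, unsigned int fragheaderlen,
                               unsigned int mtu, const struct demo_skb *skb)
{
    return (length + fragheaderlen > mtu) || (skb && skb->is_gso);
}

/* Test as described by the fix: add skb->len when a partially filled skb
 * exists, and fragheaderlen only when there is none. */
static bool demo_wants_ufo_fixed(unsigned int length, unsigned int fragheaderlen,
                                 unsigned int mtu, const struct demo_skb *skb)
{
    unsigned int already = skb ? skb->len : fragheaderlen;

    return (length + already > mtu) || (skb && skb->is_gso);
}
```

For the second `sendto()` of 1000 bytes, the old test computes 1000 + 20 = 1020, which does not exceed 1500, while the fixed test computes 1000 + 1028 = 2028 and takes the UFO path, so the datagram no longer mixes non-UFO and UFO skbs.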
      
      In the IPv6 case, skb can only be null if this is the first segment so that
      we have to use headersize (length of the first IPv6 header) rather than
      fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
      
      Fixes: e89e9cf5 ("[IPv4/IPv6]: UFO Scatter-gather approach")
      Fixes: e4c5e13a ("ipv6: Should use consistent conditional judgement for
      	ip6 fragment between __ip6_append_data and ip6_finish_output")
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 23 Jun, 2017 1 commit
    • P
      udp: fix poll() · 9bd780f5
      Paolo Abeni committed
      Michael reported a UDP breakage caused by the commit b65ac446
      ("udp: try to avoid 2 cache miss on dequeue").
      The function __first_packet_length() can update the checksum bits
      of the pending skb, making the scratched area out-of-sync, and
      setting skb->csum, if the skb was previously in need of checksum
      validation.
      
      On later recvmsg() for such skb, checksum validation will be
      invoked again - due to the wrong udp_skb_csum_unnecessary()
      value - and will fail, causing the valid skb to be dropped.
      
      This change addresses the issue by refreshing the scratch area in
      __first_packet_length() after the possible checksum update.
      
      Fixes: b65ac446 ("udp: try to avoid 2 cache miss on dequeue")
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 22 Jun, 2017 1 commit
  19. 21 Jun, 2017 1 commit