1. 04 7月, 2017 1 次提交
  2. 05 6月, 2017 1 次提交
  3. 01 2月, 2017 1 次提交
    • D
      ipv6: fix flow labels when the traffic class is non-0 · 90427ef5
      Dimitris Michailidis 提交于
      ip6_make_flowlabel() determines the flow label for IPv6 packets. It's
      supposed to be passed a flow label, which it returns as is if non-0 and
      in some other cases, otherwise it calculates a new value.
      
      The problem is callers often pass a flowi6.flowlabel, which may also
      contain traffic class bits. If the traffic class is non-0
      ip6_make_flowlabel() mistakes the non-0 it gets as a flow label and
      returns the whole thing. Thus it can return a 'flow label' longer than
      20b and the low 20b of that is typically 0 resulting in packets with 0
      label. Moreover, different packets of a flow may be labeled differently.
      For a TCP flow with ECN non-payload and payload packets get different
      labels as exemplified by this pair of consecutive packets:
      
      (pure ACK)
      Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
          0110 .... = Version: 6
          .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
              .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
              .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
          .... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49
          Payload Length: 32
          Next Header: TCP (6)
      
      (payload)
      Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
          0110 .... = Version: 6
          .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
              .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
              .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
          .... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000
          Payload Length: 688
          Next Header: TCP (6)
      
      This patch allows ip6_make_flowlabel() to be passed more than just a
      flow label and has it extract the part it really wants. This was simpler
      than modifying the callers. With this patch packets like the above become
      
      Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
          0110 .... = Version: 6
          .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
              .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
              .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
          .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
          Payload Length: 32
          Next Header: TCP (6)
      
      Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
          0110 .... = Version: 6
          .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
              .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
              .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
          .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
          Payload Length: 688
          Next Header: TCP (6)
      Signed-off-by: NDimitris Michailidis <dmichail@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90427ef5
  4. 27 1月, 2017 1 次提交
  5. 01 12月, 2016 1 次提交
  6. 10 11月, 2016 1 次提交
  7. 28 6月, 2016 3 次提交
  8. 04 5月, 2016 1 次提交
    • W
      ipv6: add new struct ipcm6_cookie · 26879da5
      Wei Wang 提交于
      In the sendmsg function of UDP, raw, ICMP and l2tp sockets, we use local
      variables like hlimits, tclass, opt and dontfrag and pass them to corresponding
      functions like ip6_make_skb, ip6_append_data and xxx_push_pending_frames.
      This is not a good practice and makes it hard to add new parameters.
      This fix introduces a new struct ipcm6_cookie similar to ipcm_cookie in
      ipv4 and include the above mentioned variables. And we only pass the
      pointer to this structure to corresponding functions. This makes it easier
      to add new parameters in the future and makes the function cleaner.
      Signed-off-by: NWei Wang <weiwan@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      26879da5
  9. 28 4月, 2016 5 次提交
  10. 15 4月, 2016 2 次提交
    • M
      ipv6: udp: Do a route lookup and update during release_cb · e646b657
      Martin KaFai Lau 提交于
      This patch adds a release_cb for UDPv6.  It does a route lookup
      and updates sk->sk_dst_cache if it is needed.  It picks up the
      left-over job from ip6_sk_update_pmtu() if the sk was owned
      by user during the pmtu update.
      
      It takes a rcu_read_lock to protect the __sk_dst_get() operations
      because another thread may do ip6_dst_store() without taking the
      sk lock (e.g. sendmsg).
      
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Reported-by: NWei Wang <weiwan@google.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e646b657
    • M
      ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update · 33c162a9
      Martin KaFai Lau 提交于
      There is a case in connected UDP socket such that
      getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
      sequence could be the following:
      1. Create a connected UDP socket
      2. Send some datagrams out
      3. Receive a ICMPV6_PKT_TOOBIG
      4. No new outgoing datagrams to trigger the sk_dst_check()
         logic to update the sk->sk_dst_cache.
      5. getsockopt(IPV6_MTU) returns the mtu from the invalid
         sk->sk_dst_cache instead of the newly created RTF_CACHE clone.
      
      This patch updates the sk->sk_dst_cache for a connected datagram sk
      during pmtu-update code path.
      
      Note that the sk->sk_v6_daddr is used to do the route lookup
      instead of skb->data (i.e. iph).  It is because a UDP socket can become
      connected after sending out some datagrams in un-connected state.  or
      It can be connected multiple times to different destinations.  Hence,
      iph may not be related to where sk is currently connected to.
      
      It is done under '!sock_owned_by_user(sk)' condition because
      the user may make another ip6_datagram_connect()  (i.e changing
      the sk->sk_v6_daddr) while dst lookup is happening in the pmtu-update
      code path.
      
      For the sock_owned_by_user(sk) == true case, the next patch will
      introduce a release_cb() which will update the sk->sk_dst_cache.
      
      Test:
      
      Server (Connected UDP Socket):
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Route Details:
      [root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac'
      2fac::/64 dev eth0  proto kernel  metric 256  pref medium
      2fac:face::/64 via 2fac::face dev eth0  metric 1024  pref medium
      
      A simple python code to create a connected UDP socket:
      
      import socket
      import errno
      
      HOST = '2fac::1'
      PORT = 8080
      
      s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
      s.bind((HOST, PORT))
      s.connect(('2fac:face::face', 53))
      print("connected")
      while True:
          try:
      	data = s.recv(1024)
          except socket.error as se:
      	if se.errno == errno.EMSGSIZE:
      		pmtu = s.getsockopt(41, 24)
      		print("PMTU:%d" % pmtu)
      		break
      s.close()
      
      Python program output after getting a ICMPV6_PKT_TOOBIG:
      [root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py
      connected
      PMTU:1300
      
      Cache routes after recieving TOOBIG:
      [root@arch-fb-vm1 ~]# ip -6 r show table cache
      2fac:face::face via 2fac::face dev eth0  metric 0
          cache  expires 463sec mtu 1300 pref medium
      
      Client (Send the ICMPV6_PKT_TOOBIG):
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      scapy is used to generate the TOOBIG message.  Here is the scapy script I have
      used:
      
      >>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac::
      1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53)
      >>> sendp(p, iface='qemubr0')
      
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Reported-by: NWei Wang <weiwan@google.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      33c162a9
  11. 05 4月, 2016 1 次提交
    • S
      sock: enable timestamping using control messages · c14ac945
      Soheil Hassas Yeganeh 提交于
      Currently, SOL_TIMESTAMPING can only be enabled using setsockopt.
      This is very costly when users want to sample writes to gather
      tx timestamps.
      
      Add support for enabling SO_TIMESTAMPING via control messages by
      using tsflags added in `struct sockcm_cookie` (added in the previous
      patches in this series) to set the tx_flags of the last skb created in
      a sendmsg. With this patch, the timestamp recording bits in tx_flags
      of the skbuff is overridden if SO_TIMESTAMPING is passed in a cmsg.
      
      Please note that this is only effective for overriding the recording
      timestamps flags. Users should enable timestamp reporting (e.g.,
      SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID) using
      socket options and then should ask for SOF_TIMESTAMPING_TX_*
      using control messages per sendmsg to sample timestamps for each
      write.
      Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c14ac945
  12. 21 3月, 2016 1 次提交
  13. 19 2月, 2016 1 次提交
  14. 10 12月, 2015 1 次提交
  15. 03 12月, 2015 1 次提交
  16. 25 11月, 2015 1 次提交
  17. 08 10月, 2015 6 次提交
  18. 26 9月, 2015 2 次提交
  19. 18 9月, 2015 1 次提交
    • E
      netfilter: Pass net into okfn · 0c4b51f0
      Eric W. Biederman 提交于
      This is immediately motivated by the bridge code that chains functions that
      call into netfilter.  Without passing net into the okfns the bridge code would
      need to guess about the best expression for the network namespace to process
      packets in.
      
      As net is frequently one of the first things computed in continuation functions
      after netfilter has done it's job passing in the desired network namespace is in
      many cases a code simplification.
      
      To support this change the function dst_output_okfn is introduced to
      simplify passing dst_output as an okfn.  For the moment dst_output_okfn
      just silently drops the struct net.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c4b51f0
  20. 01 8月, 2015 4 次提交
  21. 30 7月, 2015 1 次提交
  22. 05 6月, 2015 2 次提交
    • T
      net: Add full IPv6 addresses to flow_keys · c3f83241
      Tom Herbert 提交于
      This patch adds full IPv6 addresses into flow_keys and uses them as
      input to the flow hash function. The implementation supports either
      IPv4 or IPv6 addresses in a union, and selector is used to determine
      how may words to input to jhash2.
      
      We also add flow_get_u32_dst and flow_get_u32_src functions which are
      used to get a u32 representation of the source and destination
      addresses. For IPv6, ipv6_addr_hash is called. These functions retain
      getting the legacy values of src and dst in flow_keys.
      
      With this patch, Ethertype and IP protocol are now included in the
      flow hash input.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c3f83241
    • T
      net: Get skb hash over flow_keys structure · 42aecaa9
      Tom Herbert 提交于
      This patch changes flow hashing to use jhash2 over the flow_keys
      structure instead just doing jhash_3words over src, dst, and ports.
      This method will allow us take more input into the hashing function
      so that we can include full IPv6 addresses, VLAN, flow labels etc.
      without needing to resort to xor'ing which makes for a poor hash.
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      42aecaa9
  23. 26 5月, 2015 1 次提交