  1. 07 Mar, 2014 4 commits
  2. 05 Mar, 2014 2 commits
  3. 04 Mar, 2014 6 commits
    • net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable · ec0223ec
      Daniel Borkmann committed
      RFC4895 introduced AUTH chunks for SCTP; during the SCTP
      handshake, RANDOM, CHUNKS and HMAC-ALGO are negotiated (CHUNKS
      being optional, though):
      
        ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
        <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
        -------------------- COOKIE-ECHO -------------------->
        <-------------------- COOKIE-ACK ---------------------
      
      A special case is when an endpoint requires COOKIE-ECHO
      chunks to be authenticated:
      
        ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
        <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
        ------------------ AUTH; COOKIE-ECHO ---------------->
        <-------------------- COOKIE-ACK ---------------------
      
      RFC4895, section 6.3. Receiving Authenticated Chunks says:
      
        The receiver MUST use the HMAC algorithm indicated in
        the HMAC Identifier field. If this algorithm was not
        specified by the receiver in the HMAC-ALGO parameter in
        the INIT or INIT-ACK chunk during association setup, the
        AUTH chunk and all the chunks after it MUST be discarded
        and an ERROR chunk SHOULD be sent with the error cause
        defined in Section 4.1. [...] If no endpoint pair shared
        key has been configured for that Shared Key Identifier,
        all authenticated chunks MUST be silently discarded. [...]
      
        When an endpoint requires COOKIE-ECHO chunks to be
        authenticated, some special procedures have to be followed
        because the reception of a COOKIE-ECHO chunk might result
        in the creation of an SCTP association. If a packet arrives
        containing an AUTH chunk as a first chunk, a COOKIE-ECHO
        chunk as the second chunk, and possibly more chunks after
        them, and the receiver does not have an STCB for that
        packet, then authentication is based on the contents of
        the COOKIE-ECHO chunk. In this situation, the receiver MUST
        authenticate the chunks in the packet by using the RANDOM
        parameters, CHUNKS parameters and HMAC_ALGO parameters
        obtained from the COOKIE-ECHO chunk, and possibly a local
        shared secret as inputs to the authentication procedure
        specified in Section 6.3. If authentication fails, then
        the packet is discarded. If the authentication is successful,
        the COOKIE-ECHO and all the chunks after the COOKIE-ECHO
        MUST be processed. If the receiver has an STCB, it MUST
        process the AUTH chunk as described above using the STCB
        from the existing association to authenticate the
        COOKIE-ECHO chunk and all the chunks after it. [...]
      
      Commit bbd0d598 introduced the reception and verification
      of AUTH chunks, including the edge case of an authenticated
      COOKIE-ECHO. On reception of a COOKIE-ECHO, the function
      sctp_sf_do_5_1D_ce() handles processing: it unpacks the
      cookie, creates a new association if it passes the sanity
      checks, and also tests for authentication chunks being
      present. After the new association has been created, it
      invokes sctp_process_init() on the new association and
      walks through the parameter list it received from the INIT
      chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO
      and SCTP_PARAM_CHUNKS, and copies them into the asoc->peer
      metadata (peer_random, peer_hmacs, peer_chunks) if
      sysctl -w net.sctp.auth_enable=1 is set. If SCTP_CID_AUTH
      is set in the INIT's SCTP_PARAM_SUPPORTED_EXT parameter,
      peer_random != NULL and peer_hmacs != NULL, the peer is
      assumed to be AUTH capable (asoc->peer.auth_capable=1); in
      any other case asoc->peer.auth_capable=0.
      
      Now, if chunk->auth_chunk is available in
      sctp_sf_do_5_1D_ce(), we set up a fake auth chunk and pass
      it on to sctp_sf_authenticate(), which, at the latest in
      sctp_auth_calculate_hmac(), reliably dereferences a NULL
      pointer at position 0..0008 when setting up the crypto key
      in crypto_hash_setkey(). It uses asoc->asoc_shared_key,
      which is NULL, because the condition
      key_id == asoc->active_key_id is true if the AUTH chunk was
      injected correctly from remote. This happens no matter what
      the net.sctp.auth_enable sysctl says.
      
      The fix is to check net->sctp.auth_enable and
      asoc->peer.auth_capable before doing any operations like
      sctp_sf_authenticate(), as no key is activated in
      sctp_auth_asoc_init_active_key() in either of those cases.
      
      RFC4895 section 6.3 states that if the HMAC-ALGO passed in
      the INIT chunk was not the one used in the AUTH chunk, we
      SHOULD send an error; however, in this case it is better
      to just silently discard such a maliciously prepared handshake,
      as we didn't even receive a parameter at all. Also, as our
      endpoint has no shared key configured, section 6.3 says that
      we MUST silently discard, which we are doing from now onwards.
      
      Before calling sctp_sf_pdiscard(), we need not only to free
      the association, but also the chunk->auth_chunk skb, as
      commit bbd0d598 created a skb clone in that case.
      
      I have tested this locally by using netfilter's nfqueue and
      re-injecting packets into the local stack after maliciously
      modifying the INIT chunk (removing RANDOM; HMAC-ALGO param)
      and the SCTP packet containing the COOKIE_ECHO (injecting
      AUTH chunk before COOKIE_ECHO). Fixed with this patch applied.
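
The control flow of the fix described above can be sketched as follows. This is a minimal Python model, not the kernel code: the `Assoc` class, the `handle_cookie_echo` function and its string return values are hypothetical stand-ins for the checks on net->sctp.auth_enable and asoc->peer.auth_capable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assoc:
    """Hypothetical stand-in for the freshly created association."""
    peer_auth_capable: bool
    shared_key: Optional[bytes] = None  # stays None when no key was activated


def handle_cookie_echo(auth_enable: bool, asoc: Assoc, has_auth_chunk: bool) -> str:
    """Model of the decision taken when a COOKIE-ECHO arrives."""
    if has_auth_chunk:
        # Check capability before any authentication work: without it,
        # no key was activated and shared_key would still be None.
        if not auth_enable or not asoc.peer_auth_capable:
            # Silently drop; the caller frees the association and the
            # cloned AUTH chunk before discarding the packet.
            return "discard"
        return "authenticate"
    return "process"
```

In this model, an injected AUTH chunk never reaches the authentication step unless both the local sysctl and the peer capability allow it.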
      
      Fixes: bbd0d598 ("[SCTP]: Implement the receive and verification of AUTH chunk")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vlad Yasevich <yasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: snmp stats for Fast Open, SYN rtx, and data pkts · f19c29e3
      Yuchung Cheng committed
      Add the following snmp stats:
      
      TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed because
      the remote does not accept it or the attempts timed out.
      
      TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down
      retransmissions into SYN, fast-retransmits, timeout retransmits, etc.
      
      TCPOrigDataSent: number of outgoing packets with original data (excluding
      retransmission but including data-in-SYN). This counter is different from
      TcpOutSegs because TcpOutSegs also tracks pure ACKs. TCPOrigDataSent is
      more useful to track the TCP retransmission rate.
      
      Change TCPFastOpenActive to track only successful Fast Opens to be symmetric to
      TCPFastOpenPassive.
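
As an illustration of how TCPOrigDataSent could be used, the sketch below parses the header-line/value-line format of /proc/net/snmp and /proc/net/netstat and divides RetransSegs by TCPOrigDataSent. The function names are hypothetical, the sample numbers are made up, and this ratio is only one possible definition of a retransmission rate.

```python
def parse_proc_net(text: str) -> dict:
    """Parse 'Prefix: name ...' / 'Prefix: value ...' line pairs into a dict."""
    stats = {}
    lines = [ln.split() for ln in text.strip().splitlines()]
    # Lines come in pairs: a header line with names, then a line with values.
    for names, values in zip(lines[0::2], lines[1::2]):
        prefix = names[0].rstrip(":")
        for name, value in zip(names[1:], values[1:]):
            stats[f"{prefix}.{name}"] = int(value)
    return stats


def retransmission_rate(stats: dict) -> float:
    """Retransmitted segments per original data packet (0.0 if nothing sent)."""
    orig = stats["TcpExt.TCPOrigDataSent"]
    retrans = stats["Tcp.RetransSegs"]
    return retrans / orig if orig else 0.0


# Made-up sample in the same layout as the two /proc files concatenated.
SAMPLE = """\
Tcp: OutSegs RetransSegs
Tcp: 1500 20
TcpExt: TCPOrigDataSent TCPSynRetrans
TcpExt: 1000 3
"""

if __name__ == "__main__":
    print(retransmission_rate(parse_proc_net(SAMPLE)))
```

Dividing by TcpOutSegs instead would dilute the rate, because that counter also includes pure ACKs.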
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Nandita Dukkipati <nanditad@google.com>
      Signed-off-by: Lawrence Brakmo <brakmo@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer · 10ddceb2
      Xin Long committed
      When ip_tunnel processes multicast packets, it may check whether the packet
      is a looped-back packet through 'rt_is_output_route(skb_rtable(skb))' in
      ip_tunnel_rcv(), but before that, skb->_skb_refdst has already been dropped
      in iptunnel_pull_header(), which leads to a panic.
      
      Fix the bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681
      
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sch_tbf: Remove holes in struct tbf_sched_data. · a135e598
      Hiroaki SHIMODA committed
      On x86_64 we have 3 holes in struct tbf_sched_data.
      
      The member peak_present can be replaced with peak.rate_bytes_ps,
      because peak.rate_bytes_ps is set only when peak is specified in
      tbf_change(). tbf_peak_present() is introduced to test
      peak.rate_bytes_ps.
      
      The member max_size is moved to fill a 32-bit hole.
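
The effect of member ordering on padding can be seen with a small ctypes sketch. The two structures below are hypothetical and do not reproduce the real layout of struct tbf_sched_data; they only show how a 32-bit member placed between 64-bit members creates holes, and how moving it next to another 32-bit member removes them.

```python
import ctypes


class WithHoles(ctypes.Structure):
    # Hypothetical layout: each 32-bit member is followed by a 64-bit one,
    # so the compiler pads after it where 64-bit members are 8-byte aligned.
    _fields_ = [
        ("limit", ctypes.c_uint32),
        ("buffer", ctypes.c_uint64),
        ("max_size", ctypes.c_uint32),
        ("tokens", ctypes.c_uint64),
    ]


class Reordered(ctypes.Structure):
    # Same members, with the two 32-bit ones placed back to back.
    _fields_ = [
        ("limit", ctypes.c_uint32),
        ("max_size", ctypes.c_uint32),
        ("buffer", ctypes.c_uint64),
        ("tokens", ctypes.c_uint64),
    ]


def hole_bytes(struct_type) -> int:
    """Bytes of padding: total size minus the sum of the member sizes."""
    used = sum(ctypes.sizeof(ftype) for _, ftype in struct_type._fields_)
    return ctypes.sizeof(struct_type) - used


if __name__ == "__main__":
    for s in (WithHoles, Reordered):
        print(s.__name__, ctypes.sizeof(s), "bytes,", hole_bytes(s), "padding")
```

On a typical x86_64 build the first layout takes 32 bytes with 8 bytes of padding, while the reordered one takes 24 bytes with none.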
      Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix bogus RTT on special retransmission · c84a5711
      Yuchung Cheng committed
      RTT may be bogus with tail loss probe (TLP) when a packet
      is retransmitted and later (s)acked without the TCPCB_SACKED_RETRANS flag.
      
      For example, TLP calls __tcp_retransmit_skb() instead of
      tcp_retransmit_skb(). The skb timestamps are updated but the sacked
      flag is not marked with TCPCB_SACKED_RETRANS. As a result we'll
      get bogus RTT in tcp_clean_rtx_queue() or in tcp_sacktag_one() on
      spurious retransmission.
      
      The fix is to apply the sticky flag TCPCB_EVER_RETRANS to enforce Karn's
      check on RTT sampling. However, this would disable F-RTO if a timeout occurs
      after TLP, by resetting undo_marker in tcp_enter_loss(). We relax this
      check to apply only if any pending retransmits are still in flight.
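
Karn's check with a sticky "ever retransmitted" marker can be modelled in a few lines. The `Segment` class and both functions are hypothetical and only illustrate the rule; they are not the kernel's data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    sent_at_us: int             # timestamp of the most recent transmission
    ever_retrans: bool = False  # sticky: set once, never cleared


def retransmit(seg: Segment, now_us: int) -> None:
    """Any retransmission path updates the timestamp and sets the sticky flag."""
    seg.sent_at_us = now_us
    seg.ever_retrans = True


def rtt_sample_us(seg: Segment, ack_at_us: int) -> Optional[int]:
    """Karn's check: an ACK for a retransmitted segment is ambiguous, so skip it."""
    if seg.ever_retrans:
        return None
    return ack_at_us - seg.sent_at_us


if __name__ == "__main__":
    seg = Segment(sent_at_us=1000)
    retransmit(seg, now_us=5000)
    # The ACK may belong to the original transmission, so no sample is taken.
    print(rtt_sample_us(seg, ack_at_us=5010))
```

Without the flag, the last call would report a 10 usec RTT even though the ACK could have been triggered by the original transmission 4 ms earlier.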
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Nandita Dukkipati <nanditad@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • hsr: off by one sanity check in hsr_register_frame_in() · de39d7a4
      Dan Carpenter committed
      This is a sanity check and we never pass invalid values so this patch
      doesn't change anything.  However the node->time_in[] array has
      HSR_MAX_SLAVE (2) elements and not HSR_MAX_DEV (3).
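
A minimal model of the bounds check, assuming the index is compared against the array length: the constants take their values from the message above, while the function body and its signature are hypothetical and do not reproduce hsr_register_frame_in().

```python
HSR_MAX_SLAVE = 2  # number of elements in time_in[]
HSR_MAX_DEV = 3    # one larger, so it is the wrong bound for time_in[]


def register_frame_in(time_in: list, dev_idx: int, now: int) -> bool:
    """Store a timestamp only if dev_idx is a valid index into time_in."""
    # The bound has to match the array length; checking against
    # HSR_MAX_DEV would let index 2 through, one past the end.
    if dev_idx < 0 or dev_idx >= HSR_MAX_SLAVE:
        return False
    time_in[dev_idx] = now
    return True
```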
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 03 Mar, 2014 4 commits
  5. 01 Mar, 2014 7 commits
  6. 28 Feb, 2014 6 commits
  7. 27 Feb, 2014 9 commits
    • tcp: switch rtt estimations to usec resolution · 740b0f18
      Eric Dumazet committed
      Upcoming congestion controls for TCP require usec resolution for RTT
      estimations. Millisecond resolution is simply not enough these days.
      
      FQ/pacing in DC environments also requires this change for finer control
      and removal of bimodal behavior due to the current hack in
      tcp_update_pacing_rate() for 'small rtt'.
      
      TCP_CONG_RTT_STAMP is no longer needed.
      
      As Julian Anastasov pointed out, we need to keep user compatibility:
      tcp_metrics used to export RTT and RTTVAR in msec resolution,
      so we added RTT_US and RTTVAR_US. An iproute2 patch is needed
      to use the new attributes if provided by the kernel.
      
      In this example, the ss command displays an srtt of 32 usecs (10Gbit link):
      
      lpk51:~# ./ss -i dst lpk52
      Netid  State      Recv-Q Send-Q   Local Address:Port       Peer Address:Port
      tcp    ESTAB      0      1         10.246.11.51:42959      10.246.11.52:64614
               cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448 cwnd:10 send 3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559
      
      Updated iproute2 ip command displays:
      
      lpk51:~# ./ip tcp_metrics | grep 10.246.11.52
      10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source 10.246.11.51
      
      Old binary displays:
      
      lpk51:~# ip tcp_metrics | grep 10.246.11.52
      10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source 10.246.11.51
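
To illustrate why millisecond resolution loses information on such links, the sketch below runs the same smoothing over RTT samples expressed in usec and in whole msec. It uses a plain integer EWMA with gain 1/8 (as in RFC 6298), not the kernel's fixed-point representation; the function name and the sample values are made up.

```python
def estimate_srtt(samples):
    """Integer EWMA with gain 1/8: srtt = (7 * srtt + sample) / 8."""
    srtt = None
    for sample in samples:
        srtt = sample if srtt is None else (7 * srtt + sample) // 8
    return srtt


# Made-up RTT samples in microseconds, in the range seen on a fast LAN.
SAMPLES_US = [32, 30, 35, 31]
# The same samples truncated to whole milliseconds.
SAMPLES_MS = [s // 1000 for s in SAMPLES_US]

if __name__ == "__main__":
    print("usec resolution:", estimate_srtt(SAMPLES_US))
    print("msec resolution:", estimate_srtt(SAMPLES_MS))
```

With these inputs the usec estimate stays around 31, while every msec sample truncates to 0, leaving nothing for pacing or congestion control to work with.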
      
      With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Larry Brakmo <brakmo@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: yet another new IPV6_MTU_DISCOVER option IPV6_PMTUDISC_OMIT · 0b95227a
      Hannes Frederic Sowa committed
      This option has the same semantics as IP_PMTUDISC_OMIT for IPv4, which
      was recently introduced. It doesn't honor the path MTU discovered by the
      host but, in contrast to IPV6_PMTUDISC_INTERFACE, allows the generation of
      fragments if the packet size exceeds the MTU of the outgoing
      interface.
      
      Fixes: 93b36cf3 ("ipv6: support IPV6_PMTU_INTERFACE on sockets")
      Cc: Florian Weimer <fweimer@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMIT · 1b346576
      Hannes Frederic Sowa committed
      IP_PMTUDISC_INTERFACE has a design error: because it does not allow the
      generation of fragments if the interface mtu is exceeded, it is very
      hard to make use of this option in already deployed name server software
      for which I introduced this option.
      
      This patch adds yet another new IP_MTU_DISCOVER option that does not honor
      any path mtu information and does not accept new icmp notifications destined
      for the socket this option is enabled on. But we allow outgoing fragmentation
      in case the packet size exceeds the outgoing interface mtu.
      
      As such this new option can be used as a drop-in replacement for
      IP_PMTUDISC_DONT, which is currently in use by most name server software
      making the adoption of this option very smooth and easy.
      
      The original advantage of IP_PMTUDISC_INTERFACE is still maintained:
      ignoring incoming path MTU updates and not honoring discovered path MTUs
      in the output path.
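
From user space, a name server could request the new mode and fall back to IP_PMTUDISC_DONT on kernels that reject it. The sketch below is one way to do that. The numeric constants are the Linux UAPI values, spelled out here because Python's socket module may not export them, and the helper name `set_pmtu_mode` is hypothetical.

```python
import socket

# Linux UAPI values (linux/in.h), defined locally as an assumption that
# the running Python does not provide them as socket.* constants.
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DONT = 0
IP_PMTUDISC_OMIT = 5


def set_pmtu_mode(sock) -> str:
    """Prefer IP_PMTUDISC_OMIT; fall back to IP_PMTUDISC_DONT if rejected."""
    try:
        sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_OMIT)
        return "omit"
    except OSError:
        # Kernels without the new option reject the unknown value.
        sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)
        return "dont"
```

A caller would pass an IPv4 UDP socket, for example `set_pmtu_mode(socket.socket(socket.AF_INET, socket.SOCK_DGRAM))` on Linux.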
      
      Fixes: 482fc609 ("ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE")
      Cc: Florian Weimer <fweimer@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: use ip_skb_dst_mtu to determine mtu in ip_fragment · 69647ce4
      Hannes Frederic Sowa committed
      ip_skb_dst_mtu mostly falls back to ip_dst_mtu_maybe_forward if no socket
      is attached to the skb (in case of forwarding) or determines the mtu like
      we do in ip_finish_output, which actually checks if we should branch to
      ip_fragment. Thus use the same function to determine the mtu here, too.
      
      This is important for the introduction of IP_PMTUDISC_OMIT, where we
      want the packets to be cut into pieces of the size of the outgoing
      interface mtu. IPv6 already does this correctly.
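
What cutting a packet into MTU-sized pieces means for IPv4 can be shown with a short calculation. The sketch assumes a 20-byte IPv4 header without options and a hypothetical helper name; it only computes fragment offsets and lengths and is not the ip_fragment() implementation.

```python
IPV4_HEADER_LEN = 20  # assumption: no IP options


def fragment_sizes(payload_len: int, mtu: int) -> list:
    """Return (offset, length) pairs for the IP payload at a given MTU."""
    # Fragment offsets are expressed in 8-byte units, so every fragment
    # except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - IPV4_HEADER_LEN) & ~7
    if per_fragment <= 0:
        raise ValueError("mtu too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(per_fragment, payload_len - offset)
        fragments.append((offset, length))
        offset += length
    return fragments


if __name__ == "__main__":
    # A 4000-byte IP payload leaving through a 1500-byte interface MTU.
    print(fragment_sizes(4000, 1500))
```

Which MTU is passed in is exactly what the patch is about: the interface MTU rather than a smaller, previously learned path MTU.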
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neigh: probe application via netlink in NUD_PROBE · a960ff81
      Timo Teräs committed
      iproute2 arpd seems to expect this as there's code and comments
      to handle netlink probes with NUD_PROBE set. It is used to flush
      the arpd cached mappings.
      
      opennhrp instead turns off unicast probes (so it can handle all
      neighbour discovery). Without this change it will not see NUD_PROBE
      probes and cannot reconfirm the mapping. Thus currently the neigh entry
      will just fail, which can cause a few packets to be dropped until
      broadcast discovery is restarted.
      
      Earlier discussion on the subject:
      http://marc.info/?t=139305877100001&r=1&w=2
      
      Signed-off-by: Timo Teräs <timo.teras@iki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: log src and dst along with "udp checksum is 0" · 84a3e72c
      Bjørn Mork committed
      These info messages are rather pointless without any means to identify
      the source of the bogus packets.  Logging the src and dst addresses and
      ports may help a bit.
      
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Add sysfs file for port number · 3f85944f
      Amir Vadai committed
      Add a sysfs file to enable user space to query the device
      port number used by a netdevice instance. This is needed for
      devices that have multiple ports on the same PCI function.
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: add mib counters to track zero window transitions · 8e165e20
      Florian Westphal committed
      Three counters are added:
      - one to track when we went from non-zero to zero window
      - one to track the reverse
      - one counter incremented when we want to announce zero window,
        but can't because we would shrink current window.
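
The three counters listed above can be modelled as follows. This is a simplified sketch, not the kernel's window selection code: the function, its parameters and the counter keys are hypothetical names chosen for the illustration.

```python
def advertise(old_window: int, cur_window: int, new_window: int, counters: dict) -> int:
    """Pick the window to advertise and count zero-window transitions.

    old_window: window advertised last time
    cur_window: part of the previously advertised window that is still open
    new_window: window we would like to advertise now
    """
    if new_window < cur_window:
        # Never shrink a window that was already offered to the peer.
        if new_window == 0:
            counters["want_zero"] += 1  # wanted zero, but could not announce it
        new_window = cur_window
    if new_window == 0:
        if old_window != 0:
            counters["to_zero"] += 1    # non-zero -> zero
    elif old_window == 0:
        counters["from_zero"] += 1      # zero -> non-zero
    return new_window
```

Calling it with the previous and desired windows on every outgoing segment keeps the three counters in step with the transitions.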
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: use NET_INC_STATS() · 9a9bfd03
      Eric Dumazet committed
      While LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES can only be incremented
      in tcp_transmit_skb() from softirq (incoming message or timer
      activation), it is better to use NET_INC_STATS() instead of
      NET_INC_STATS_BH() as tcp_transmit_skb() can be called from process
      context.
      
      This will avoid copy/paste confusion when/if we want to add
      other SNMP counters in tcp_transmit_skb().
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 26 Feb, 2014 2 commits