1. 18 4月, 2018 1 次提交
    • D
      net: Move fib_convert_metrics to metrics file · a919525a
      David Ahern 提交于
      Move logic of fib_convert_metrics into ip_metrics_convert. This allows
      the code that converts netlink attributes into metrics struct to be
      re-used in a later patch by IPv6.
      
      This is mostly a code move with the following changes to variable names:
        - fi->fib_net becomes net
        - fc_mx and fc_mx_len are passed as inputs pulled from fib_config
        - metrics array is passed as an input from fi->fib_metrics->metrics
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a919525a
  2. 01 4月, 2018 1 次提交
  3. 23 3月, 2018 1 次提交
  4. 15 3月, 2018 1 次提交
    • S
      ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu · d52e5a7e
      Sabrina Dubroca 提交于
      Prior to the rework of PMTU information storage in commit
      2c8cec5c ("ipv4: Cache learned PMTU information in inetpeer."),
      when a PMTU event advertising a PMTU smaller than
      net.ipv4.route.min_pmtu was received, we would disable setting the DF
      flag on packets by locking the MTU metric, and set the PMTU to
      net.ipv4.route.min_pmtu.
      
      Since then, we don't disable DF, and set PMTU to
      net.ipv4.route.min_pmtu, so the intermediate router that has this link
      with a small MTU will have to drop the packets.
      
      This patch reestablishes pre-2.6.39 behavior by splitting
      rtable->rt_pmtu into a bitfield with rt_mtu_locked and rt_pmtu.
      rt_mtu_locked indicates that we shouldn't set the DF bit on that path,
      and is checked in ip_dont_fragment().
      
      One possible workaround is to set net.ipv4.route.min_pmtu to a value low
      enough to accommodate the lowest MTU encountered.
      
      Fixes: 2c8cec5c ("ipv4: Cache learned PMTU information in inetpeer.")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d52e5a7e
  5. 01 3月, 2018 1 次提交
  6. 14 12月, 2017 1 次提交
  7. 03 12月, 2017 1 次提交
  8. 17 8月, 2017 1 次提交
    • E
      ipv4: better IP_MAX_MTU enforcement · c780a049
      Eric Dumazet 提交于
      While working on yet another syzkaller report, I found
      that our IP_MAX_MTU enforcements were not properly done.
      
      gcc seems to reload dev->mtu for min(dev->mtu, IP_MAX_MTU), and
      final result can be bigger than IP_MAX_MTU :/
      
      This is a problem because device mtu can be changed on other cpus or
      threads.
      
      While this patch does not fix the issue I am working on, it is
      probably worth addressing it.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c780a049
  9. 08 8月, 2017 1 次提交
  10. 07 8月, 2017 1 次提交
  11. 14 4月, 2017 1 次提交
    • G
      net: ipv4: Refine the ipv4_default_advmss · 7ed14d97
      Gao Feng 提交于
      1. Don't get the metric RTAX_ADVMSS of dst.
      There are two reasons.
      1) Its caller dst_metric_advmss has already invoke dst_metric_advmss
      before invoke default_advmss.
      2) The ipv4_default_advmss is used to get the default mss, it should
      not try to get the metric like ip6_default_advmss.
      
      2. Use sizeof(tcphdr)+sizeof(iphdr) instead of literal 40.
      
      3. Define one new macro IPV4_MAX_PMTU instead of 65535 according to
      RFC 2675, section 5.1.
      Signed-off-by: NGao Feng <fgao@ikuai8.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7ed14d97
  12. 25 1月, 2017 1 次提交
    • K
      Introduce a sysctl that modifies the value of PROT_SOCK. · 4548b683
      Krister Johansen 提交于
      Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
      that denotes the first unprivileged inet port in the namespace.  To
      disable all privileged ports set this to zero.  It also checks for
      overlap with the local port range.  The privileged and local range may
      not overlap.
      
      The use case for this change is to allow containerized processes to bind
      to priviliged ports, but prevent them from ever being allowed to modify
      their container's network configuration.  The latter is accomplished by
      ensuring that the network namespace is not a child of the user
      namespace.  This modification was needed to allow the container manager
      to disable a namespace's priviliged port restrictions without exposing
      control of the network namespace to processes in the user namespace.
      Signed-off-by: NKrister Johansen <kjlx@templeofstupid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4548b683
  13. 08 11月, 2016 1 次提交
  14. 05 11月, 2016 1 次提交
    • L
      net: inet: Support UID-based routing in IP protocols. · e2d118a1
      Lorenzo Colitti 提交于
      - Use the UID in routing lookups made by protocol connect() and
        sendmsg() functions.
      - Make sure that routing lookups triggered by incoming packets
        (e.g., Path MTU discovery) take the UID of the socket into
        account.
      - For packets not associated with a userspace socket, (e.g., ping
        replies) use UID 0 inside the user namespace corresponding to
        the network namespace the socket belongs to. This allows
        all namespaces to apply routing and iptables rules to
        kernel-originated traffic in that namespaces by matching UID 0.
        This is better than using the UID of the kernel socket that is
        sending the traffic, because the UID of kernel sockets created
        at namespace creation time (e.g., the per-processor ICMP and
        TCP sockets) is the UID of the user that created the socket,
        which might not be mapped in the namespace.
      
      Tested: compiles allnoconfig, allyesconfig, allmodconfig
      Tested: https://android-review.googlesource.com/253302Signed-off-by: NLorenzo Colitti <lorenzo@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e2d118a1
  15. 04 11月, 2016 1 次提交
  16. 27 10月, 2016 1 次提交
    • E
      udp: fix IP_CHECKSUM handling · 10df8e61
      Eric Dumazet 提交于
      First bug was added in commit ad6f939a ("ip: Add offset parameter to
      ip_cmsg_recv") : Tom missed that ipv4 udp messages could be received on
      AF_INET6 socket. ip_cmsg_recv(msg, skb) should have been replaced by
      ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr));
      
      Then commit e6afc8ac ("udp: remove headers from UDP packets before
      queueing") forgot to adjust the offsets now UDP headers are pulled
      before skb are put in receive queue.
      
      Fixes: ad6f939a ("ip: Add offset parameter to ip_cmsg_recv")
      Fixes: e6afc8ac ("udp: remove headers from UDP packets before queueing")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Sam Kumar <samanthakumar@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Tested-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10df8e61
  17. 17 10月, 2016 1 次提交
    • D
      net: Require exact match for TCP socket lookups if dif is l3mdev · a04a480d
      David Ahern 提交于
      Currently, socket lookups for l3mdev (vrf) use cases can match a socket
      that is bound to a port but not a device (ie., a global socket). If the
      sysctl tcp_l3mdev_accept is not set this leads to ack packets going out
      based on the main table even though the packet came in from an L3 domain.
      The end result is that the connection does not establish creating
      confusion for users since the service is running and a socket shows in
      ss output. Fix by requiring an exact dif to sk_bound_dev_if match if the
      skb came through an interface enslaved to an l3mdev device and the
      tcp_l3mdev_accept is not set.
      
      skb's through an l3mdev interface are marked by setting a flag in
      inet{6}_skb_parm. The IPv6 variant is already set; this patch adds the
      flag for IPv4. Using an skb flag avoids a device lookup on the dif. The
      flag is set in the VRF driver using the IP{6}CB macros. For IPv4, the
      inet_skb_parm struct is moved in the cb per commit 971f10ec, so the
      match function in the TCP stack needs to use TCP_SKB_CB. For IPv6, the
      move is done after the socket lookup, so IP6CB is used.
      
      The flags field in inet_skb_parm struct needs to be increased to add
      another flag. There is currently a 1-byte hole following the flags,
      so it can be expanded to u16 without increasing the size of the struct.
      
      Fixes: 193125db ("net: Introduce VRF device driver")
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a04a480d
  18. 30 9月, 2016 1 次提交
  19. 20 7月, 2016 1 次提交
  20. 30 6月, 2016 1 次提交
  21. 12 5月, 2016 1 次提交
    • D
      net: original ingress device index in PKTINFO · 0b922b7a
      David Ahern 提交于
      Applications such as OSPF and BFD need the original ingress device not
      the VRF device; the latter can be derived from the former. To that end
      add the skb_iif to inet_skb_parm and set it in ipv4 code after clearing
      the skb control buffer similar to IPv6. From there the pktinfo can just
      pull it from cb with the PKTINFO_SKB_CB cast.
      
      The previous patch moving the skb->dev change to L3 means nothing else
      is needed for IPv6; it just works.
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b922b7a
  22. 28 4月, 2016 6 次提交
  23. 05 4月, 2016 1 次提交
  24. 02 3月, 2016 1 次提交
  25. 17 2月, 2016 2 次提交
  26. 13 10月, 2015 1 次提交
  27. 08 10月, 2015 6 次提交
  28. 05 10月, 2015 1 次提交
  29. 30 9月, 2015 1 次提交