1. 17 9月, 2016 1 次提交
  2. 11 9月, 2016 6 次提交
  3. 10 9月, 2016 1 次提交
    • G
      ipv6: report NLM_F_CREATE and NLM_F_EXCL flags in RTM_NEWROUTE events · 73483c12
      Guillaume Nault 提交于
      Since commit 37a1d361 ("ipv6: include NLM_F_REPLACE in route
      replace notifications"), RTM_NEWROUTE notifications have their
      NLM_F_REPLACE flag set if the new route replaced a preexisting one.
      However, other flags aren't set.
      
      This patch reports the missing NLM_F_CREATE and NLM_F_EXCL flag bits.
      
      NLM_F_APPEND is not reported, because in ipv6 a NLM_F_CREATE request
      is interpreted as an append request (contrary to ipv4, "prepend" is not
      supported, so if NLM_F_EXCL is not set then NLM_F_APPEND is implicit).
      
      As a result, the possible flag combination can now be reported
      (iproute2's terminology into parentheses):
      
        * NLM_F_CREATE | NLM_F_EXCL: route didn't exist, exclusive creation
          ("add").
        * NLM_F_CREATE: route did already exist, new route added after
          preexisting ones ("append").
        * NLM_F_REPLACE: route did already exist, new route replaced the
          first preexisting one ("change").
      Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      73483c12
  4. 07 9月, 2016 2 次提交
    • W
      ipv6: addrconf: fix dev refcont leak when DAD failed · 751eb6b6
      Wei Yongjun 提交于
      In general, when DAD detected IPv6 duplicate address, ifp->state
      will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
      delayed work, the call tree should be like this:
      
      ndisc_recv_ns
        -> addrconf_dad_failure        <- missing ifp put
           -> addrconf_mod_dad_work
             -> schedule addrconf_dad_work()
               -> addrconf_dad_stop()  <- missing ifp hold before call it
      
      addrconf_dad_failure() called with ifp refcont holding but not put.
      addrconf_dad_work() call addrconf_dad_stop() without extra holding
      refcount. This will not cause any issue normally.
      
      But the race between addrconf_dad_failure() and addrconf_dad_work()
      may cause ifp refcount leak and netdevice can not be unregister,
      dmesg show the following messages:
      
      IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
      ...
      unregister_netdevice: waiting for eth0 to become free. Usage count = 1
      
      Cc: stable@vger.kernel.org
      Fixes: c15b1cca ("ipv6: move DAD and addrconf_verify processing
      to workqueue")
      Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      751eb6b6
    • D
      ipv6: release dst in ping_v6_sendmsg · 03c2778a
      Dave Jones 提交于
      Neither the failure or success paths of ping_v6_sendmsg release
      the dst it acquires.  This leads to a flood of warnings from
      "net/core/dst.c:288 dst_release" on older kernels that
      don't have 8bf4ada2 backported.
      
      That patch optimistically hoped this had been fixed post 3.10, but
      it seems at least one case wasn't, where I've seen this triggered
      a lot from machines doing unprivileged icmp sockets.
      
      Cc: Martin Lau <kafai@fb.com>
      Signed-off-by: NDave Jones <davej@codemonkey.org.uk>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      03c2778a
  5. 02 9月, 2016 4 次提交
  6. 31 8月, 2016 1 次提交
    • R
      net: lwtunnel: Handle fragmentation · 14972cbd
      Roopa Prabhu 提交于
      Today mpls iptunnel lwtunnel_output redirect expects the tunnel
      output function to handle fragmentation. This is ok but can be
      avoided if we did not do the mpls output redirect too early.
      ie we could wait until ip fragmentation is done and then call
      mpls output for each ip fragment.
      
      To make this work we will need,
      1) the lwtunnel state to carry encap headroom
      2) and do the redirect to the encap output handler on the ip fragment
      (essentially do the output redirect after fragmentation)
      
      This patch adds tunnel headroom in lwtstate to make sure we
      account for tunnel data in mtu calculations during fragmentation
      and adds new xmit redirect handler to redirect to lwtunnel xmit func
      after ip fragmentation.
      
      This includes IPV6 and some mtu fixes and testing from David Ahern.
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14972cbd
  7. 30 8月, 2016 1 次提交
  8. 29 8月, 2016 2 次提交
    • E
      tcp: add tcp_add_backlog() · c9c33212
      Eric Dumazet 提交于
      When TCP operates in lossy environments (between 1 and 10 % packet
      losses), many SACK blocks can be exchanged, and I noticed we could
      drop them on busy senders, if these SACK blocks have to be queued
      into the socket backlog.
      
      While the main cause is the poor performance of RACK/SACK processing,
      we can try to avoid these drops of valuable information that can lead to
      spurious timeouts and retransmits.
      
      Cause of the drops is the skb->truesize overestimation caused by :
      
      - drivers allocating ~2048 (or more) bytes as a fragment to hold an
        Ethernet frame.
      
      - various pskb_may_pull() calls bringing the headers into skb->head
        might have pulled all the frame content, but skb->truesize could
        not be lowered, as the stack has no idea of each fragment truesize.
      
      The backlog drops are also more visible on bidirectional flows, since
      their sk_rmem_alloc can be quite big.
      
      Let's add some room for the backlog, as only the socket owner
      can selectively take action to lower memory needs, like collapsing
      receive queues or partial ofo pruning.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c9c33212
    • T
      tcp: Set read_sock and peek_len proto_ops · 32035585
      Tom Herbert 提交于
      In inet_stream_ops we set read_sock to tcp_read_sock and peek_len to
      tcp_peek_len (which is just a stub function that calls tcp_inq).
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32035585
  9. 26 8月, 2016 2 次提交
  10. 25 8月, 2016 1 次提交
    • L
      netfilter: nft_reject: restrict to INPUT/FORWARD/OUTPUT · 89e1f6d2
      Liping Zhang 提交于
      After I add the nft rule "nft add rule filter prerouting reject
      with tcp reset", kernel panic happened on my system:
        NULL pointer dereference at ...
        IP: [<ffffffff81b9db2f>] nf_send_reset+0xaf/0x400
        Call Trace:
        [<ffffffff81b9da80>] ? nf_reject_ip_tcphdr_get+0x160/0x160
        [<ffffffffa0928061>] nft_reject_ipv4_eval+0x61/0xb0 [nft_reject_ipv4]
        [<ffffffffa08e836a>] nft_do_chain+0x1fa/0x890 [nf_tables]
        [<ffffffffa08e8170>] ? __nft_trace_packet+0x170/0x170 [nf_tables]
        [<ffffffffa06e0900>] ? nf_ct_invert_tuple+0xb0/0xc0 [nf_conntrack]
        [<ffffffffa07224d4>] ? nf_nat_setup_info+0x5d4/0x650 [nf_nat]
        [...]
      
      Because in the PREROUTING chain, routing information is not exist,
      then we will dereference the NULL pointer and oops happen.
      
      So we restrict reject expression to INPUT, FORWARD and OUTPUT chain.
      This is consistent with iptables REJECT target.
      Signed-off-by: NLiping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      89e1f6d2
  11. 24 8月, 2016 6 次提交
  12. 23 8月, 2016 1 次提交
    • M
      net: ipv6: Remove addresses for failures with strict DAD · 85b51b12
      Mike Manning 提交于
      If DAD fails with accept_dad set to 2, global addresses and host routes
      are incorrectly left in place. Even though disable_ipv6 is set,
      contrary to documentation, the addresses are not dynamically deleted
      from the interface. It is only on a subsequent link down/up that these
      are removed. The fix is not only to set the disable_ipv6 flag, but
      also to call addrconf_ifdown(), which is the action to carry out when
      disabling IPv6. This results in the addresses and routes being deleted
      immediately. The DAD failure for the LL addr is determined as before
      via netlink, or by the absence of the LL addr (which also previously
      would have had to be checked for in case of an intervening link down
      and up). As the call to addrconf_ifdown() requires an rtnl lock, the
      logic to disable IPv6 when DAD fails is moved to addrconf_dad_work().
      
      Previous behavior:
      
      root@vm1:/# sysctl net.ipv6.conf.eth3.accept_dad=2
      net.ipv6.conf.eth3.accept_dad = 2
      root@vm1:/# ip -6 addr add 2000::10/64 dev eth3
      root@vm1:/# ip link set up eth3
      root@vm1:/# ip -6 addr show dev eth3
      5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
          inet6 2000::10/64 scope global
             valid_lft forever preferred_lft forever
          inet6 fe80::5054:ff:fe43:dd5a/64 scope link tentative dadfailed
             valid_lft forever preferred_lft forever
      root@vm1:/# ip -6 route show dev eth3
      2000::/64  proto kernel  metric 256
      fe80::/64  proto kernel  metric 256
      root@vm1:/# ip link set down eth3
      root@vm1:/# ip link set up eth3
      root@vm1:/# ip -6 addr show dev eth3
      root@vm1:/# ip -6 route show dev eth3
      root@vm1:/#
      
      New behavior:
      
      root@vm1:/# sysctl net.ipv6.conf.eth3.accept_dad=2
      net.ipv6.conf.eth3.accept_dad = 2
      root@vm1:/# ip -6 addr add 2000::10/64 dev eth3
      root@vm1:/# ip link set up eth3
      root@vm1:/# ip -6 addr show dev eth3
      root@vm1:/# ip -6 route show dev eth3
      root@vm1:/#
      Signed-off-by: NMike Manning <mmanning@brocade.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      85b51b12
  13. 22 8月, 2016 1 次提交
  14. 18 8月, 2016 1 次提交
  15. 16 8月, 2016 2 次提交
  16. 14 8月, 2016 2 次提交
  17. 13 8月, 2016 1 次提交
  18. 11 8月, 2016 2 次提交
  19. 09 8月, 2016 1 次提交
  20. 31 7月, 2016 1 次提交
  21. 27 7月, 2016 1 次提交