- 11 Aug, 2015 3 commits
-
-
Committed by David Ahern
Rules can be installed that direct route lookups to specific tables based on oif. Plumb the oif through the xfrm lookups so it gets set in the flow struct and passed to the resolver routines. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
-
Committed by Thomas Egerer
Using ipv6_get_dsfield on the outer IP header implies that the inner and outer headers are of the same address family. For interfamily tunnels, particularly 646, the code reading the DSCP field obtains the wrong values (IHL and the upper four bits of the DSCP field). This can cause the code to detect a congestion encountered state in the outer header and enable the corresponding bits in the inner header, too. Since the DSCP field is stored in the xfrm mode common buffer independently from the IP version of the outer header, it's safe (and correct) to take this value from there. Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
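The misread described above can be reproduced on two raw header bytes. The sketch below is an illustration only: `ipv6_get_dsfield_like` is a hypothetical stand-in that assumes the IPv6 accessor takes the first 16 bits of the header, drops the low 4 bits and keeps 8 bits, and the sample TOS value is made up.

```python
import struct

ECN_MASK = 0x03  # low two bits of the DS field carry ECN
ECN_CE = 0x03    # "congestion encountered" codepoint


def ipv6_get_dsfield_like(header: bytes) -> int:
    """Hypothetical stand-in for the IPv6 accessor (assumed layout):
    first 16 bits of the header, shifted right by 4, truncated to 8 bits."""
    (first16,) = struct.unpack("!H", header[:2])
    return (first16 >> 4) & 0xFF


def ipv4_tos(header: bytes) -> int:
    """The TOS/DS field of an IPv4 header is simply its second byte."""
    return header[1]


# Start of an outer IPv4 header: version 4, IHL 5, TOS 0x30 (ECN bits are 00).
outer_ipv4 = bytes([0x45, 0x30])

# Applying the IPv6 accessor to it mixes the IHL nibble with the upper
# four bits of the TOS byte.
misread = ipv6_get_dsfield_like(outer_ipv4)

print(hex(misread))                           # value the IPv6 accessor yields
print(misread & ECN_MASK == ECN_CE)           # looks like CE in the misread value
print(ipv4_tos(outer_ipv4) & ECN_MASK == ECN_CE)  # the real TOS has no CE
```

In this made-up case the low two bits of the misread value come from the DSCP part of the TOS byte, which is how a packet without congestion marking can appear to carry one.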
-
Committed by Pravin B Shelar
The following patch creates a new tunnel flag which enables tunnel metadata collection on a given device. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Aug, 2015 1 commit
-
-
Make it similar to reject_tg() in ipt_REJECT. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
- 01 Aug, 2015 4 commits
-
-
Committed by Tom Herbert
Per RFC 6437, stateful flow labels (e.g. labels set by the flow label manager) cannot "disturb" nodes taking part in stateless flow labels. While the ranges only reduce the flow label entropy by one bit, it is conceivable that this might bias the algorithm on some routers, causing a load imbalance. For best results on the Internet we really need the full 20 bits. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
Change the meaning of net.ipv6.auto_flowlabels to provide a mode for automatic flow label generation. There are four modes:
0: flow labels are disabled
1: flow labels are enabled, sockets can opt-out
2: flow labels are allowed, sockets can opt-in
3: flow labels are enabled and enforced, no opt-out for sockets
np->autoflowlabel is initialized according to the sysctl value. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
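The four modes can be read as a small decision table. The sketch below is one interpretation of the message, not the kernel code; the function name and the `sock_choice` parameter (None when the socket expressed no preference) are hypothetical.

```python
from typing import Optional


def uses_auto_flow_label(mode: int, sock_choice: Optional[bool] = None) -> bool:
    """Decide whether a socket gets an automatic flow label.

    mode        -- value of the sysctl (0..3) as described in the message
    sock_choice -- hypothetical per-socket setting: True/False if the socket
                   opted in/out explicitly, None if it never set the option
    """
    if mode not in (0, 1, 2, 3):
        raise ValueError("mode must be 0, 1, 2 or 3")
    if mode == 0:      # disabled: the socket setting is ignored
        return False
    if mode == 3:      # enforced: no opt-out for sockets
        return True
    default = mode == 1  # mode 1 defaults to on, mode 2 defaults to off
    return default if sock_choice is None else sock_choice


for mode in range(4):
    print(mode, [uses_auto_flow_label(mode, c) for c in (None, True, False)])
```

The per-socket default (on for mode 1, off for mode 2) corresponds to the statement that np->autoflowlabel is initialized from the sysctl value.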
-
Committed by Tom Herbert
We can't call skb_get_hash here since the packet is not yet complete enough for the flow dissector. Create the hash based on flowi6 instead. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
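As a rough illustration of hashing the flow description rather than the packet, the sketch below derives a 20-bit label from a flow tuple. Everything here is an assumption made for the example: the field selection, the use of SHA-256 as a stand-in hash, and the function name do not reflect the kernel's actual hashing.

```python
import hashlib
import ipaddress

FLOW_LABEL_MASK = 0xFFFFF  # an IPv6 flow label is 20 bits wide


def flow_label_from_flow(saddr: str, daddr: str, proto: int,
                         sport: int, dport: int) -> int:
    """Hash a flow description (not packet contents) into a 20-bit label.

    SHA-256 is only a convenient stand-in for illustration.
    """
    key = b"".join([
        ipaddress.IPv6Address(saddr).packed,
        ipaddress.IPv6Address(daddr).packed,
        bytes([proto]),
        sport.to_bytes(2, "big"),
        dport.to_bytes(2, "big"),
    ])
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") & FLOW_LABEL_MASK


label = flow_label_from_flow("2001:db8::1", "2001:db8::2", 6, 40000, 443)
print(f"{label:#07x}")
```

Because the inputs are known before any packet is built, the label can be computed at the point where the packet is still incomplete.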
-
Committed by Roopa Prabhu
This patch adds a net argument to ipv6_stub_impl.ipv6_dst_lookup for use cases where sk is not available (like mpls). sk appears to be needed to get the namespace 'net' and is optional otherwise. This patch series changes ipv6_stub_impl.ipv6_dst_lookup to take a net argument. sk remains optional. All callers of ipv6_stub_impl.ipv6_dst_lookup have been modified to pass net. I have modified them to use the already available 'net' in the scope of the call. I can change them to sock_net(sk) to avoid any unintended change in behaviour if the sock namespace is different. From code inspection, they don't seem to be. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 31 Jul, 2015 1 commit
-
-
Committed by Hangbin Liu
Commit 6fd99094 ("ipv6: Don't reduce hop limit for an interface") stopped accepting the hop limit from an RA if it is smaller than the current hop limit, for security reasons. But this behavior breaks the RFC definition. RFC 4861, 6.3.4. Processing Received Router Advertisements: A Router Advertisement field (e.g., Cur Hop Limit, Reachable Time, and Retrans Timer) may contain a value denoting that it is unspecified. In such cases, the parameter should be ignored and the host should continue using whatever value it is already using. If the received Cur Hop Limit value is non-zero, the host SHOULD set its CurHopLimit variable to the received value. So add the sysctl option accept_ra_min_hop_limit to let users choose the minimum hop limit value they accept from an RA, and set the default to 1 to meet RFC standards. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
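The acceptance rule described above can be sketched as a small function. This is an interpretation of the message under stated assumptions: the function and parameter names are hypothetical, and a received value of 0 is treated as "unspecified" per the quoted RFC text.

```python
def updated_hop_limit(current: int, ra_hop_limit: int,
                      min_hop_limit: int = 1) -> int:
    """Return the hop limit to use after processing a Router Advertisement.

    current       -- hop limit the host is using now
    ra_hop_limit  -- Cur Hop Limit field received in the RA (0 = unspecified)
    min_hop_limit -- smallest value the administrator accepts (default 1,
                     matching the default named in the message)
    """
    if ra_hop_limit == 0:
        return current        # unspecified: keep using the current value
    if ra_hop_limit < min_hop_limit:
        return current        # below the configured floor: ignore it
    return ra_hop_limit       # acceptable: adopt the advertised value


print(updated_hop_limit(64, 0))       # unspecified, stays at current value
print(updated_hop_limit(64, 32))      # default floor of 1 accepts a lower value
print(updated_hop_limit(64, 32, 64))  # a floor of 64 rejects it
```

With the default floor of 1, any non-zero advertised value is accepted, which is the RFC behaviour; raising the floor restores protection against an RA that lowers the hop limit too far.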
-
- 30 Jul, 2015 3 commits
-
-
Committed by Florian Westphal
We can use a union for most of the temporary cruft (original ipv4/ipv6 address, source mac, physoutdev) since they're used during different stages of br netfilter traversal. Also get rid of the last two ->mask users. Shrinks the struct from 48 to 32 bytes on 64-bit architectures. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Committed by Eric Dumazet
This patch is the IPv6 equivalent of commit 6c8b4e3f ("arp: flush arp cache on IFF_NOARP change"). Without it, we keep buggy neighbours in the cache, with destination MAC address equal to our own MAC address.
Tested:
tcpdump -i eth0 -s 0 ip6 -n -e &
ip link set dev eth0 arp off
ping6 remote // sends buggy frames
ip link set dev eth0 arp on
ping6 remote // should work once kernel is patched
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mario Fanelli <mariofanelli@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
This patch creates sk_set_txhash and eliminates the protocol-specific inet_set_txhash and ip6_set_txhash. sk_set_txhash simply sets a random number instead of performing flow dissection. sk_set_txhash is also allowed to be called multiple times for the same socket; we'll need this when redoing the hash for negative routing advice. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 Jul, 2015 8 commits
-
-
Committed by Martin KaFai Lau
The patch checks neigh->nud_state before acquiring the writer lock. Note that rt6_probe() is only used in CONFIG_IPV6_ROUTER_PREF. 40 udpflood processes and a /64 gateway route are used. The gateway has NUD_PERMANENT. Each of them is run for 30s. At the end, the total number of finished sendto() calls: Before: 55M After: 95M Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> CC: Julian Anastasov <ja@ssi.bg> CC: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Martin KaFai Lau
This is prep work for the next patch to remove write_lock from rt6_probe(). 1. Reduce the number of if(neigh) checks from 4 to 1. 2. Bring the write_(un)lock() closer to the operations that the lock is protecting. Hopefully, the above makes rt6_probe() more readable. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Julian Anastasov <ja@ssi.bg> Cc: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Nicolas Dichtel
It saves some lines and simplifies the code a bit when the state is returned by this function. It's also useful for handling a NULL entry. To avoid overly long lines, I've also renamed lwtunnel_state_get() and lwtunnel_state_put() to lwtstate_get() and lwtstate_put(). CC: Thomas Graf <tgraf@suug.ch> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Nicolas Dichtel
We need to copy this field (ip6_rt_cache_alloc() and ip6_rt_pcpu_alloc() use ip6_rt_copy_init() to build a dst). CC: Thomas Graf <tgraf@suug.ch> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 19e42e45 ("ipv6: support for fib route lwtunnel encap attributes") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Nicolas Dichtel
This function makes sense only when LWTUNNEL_STATE_OUTPUT_REDIRECT is set. The check is already done in IPv4. CC: Thomas Graf <tgraf@suug.ch> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 74a0f2fe ("ipv6: rt6_info output redirect to tunnel output") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Nikolay Aleksandrov
We can simply remove the INET_FRAG_EVICTED flag to avoid all the flags race conditions with the evictor and use a participation test for the evictor list. When we're at that point (after inet_frag_kill) in the timer, there are 2 possible cases: 1. The evictor added the entry to its evictor list while the timer was waiting for the chainlock, or 2. The timer unchained the entry and the evictor won't see it. In both cases we should be able to see list_evictor correctly due to the sync on the chainlock. Joint work with Florian Westphal. Tested-by: Frank Schreuder <fschreuder@transip.nl> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Florian Westphal
A followup patch will call it after inet_frag_queue was freed, so q->net doesn't work anymore (but netf = q->net; free(q); mem_limit(netf) would). Tested-by: Frank Schreuder <fschreuder@transip.nl> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Wei-Chun Chao
A similar check was added in ip_rcv but not in ipv6_rcv.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81734e0a>] ipv6_rcv+0xfa/0x500
Call Trace:
[<ffffffff816c9786>] ? ip_rcv+0x296/0x400
[<ffffffff817732d2>] ? packet_rcv+0x52/0x410
[<ffffffff8168e99f>] __netif_receive_skb_core+0x63f/0x9a0
[<ffffffffc02b34a0>] ? br_handle_frame_finish+0x580/0x580 [bridge]
[<ffffffff8109912c>] ? update_rq_clock.part.81+0x1c/0x40
[<ffffffff8168ed18>] __netif_receive_skb+0x18/0x60
[<ffffffff8168fa1f>] process_backlog+0x9f/0x150
Fixes: ee122c79 (vxlan: Flow based tunneling) Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Jul, 2015 1 commit
-
-
Committed by Erik Kline
Per RFC 6724, section 4, "Candidate Source Addresses": It is RECOMMENDED that the candidate source addresses be the set of unicast addresses assigned to the interface that will be used to send to the destination (the "outgoing" interface). Add a sysctl to enable this behaviour. Signed-off-by: Erik Kline <ek@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
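The effect of restricting the candidate set can be sketched as follows. The data layout (a dict from interface name to its unicast addresses), the function name and the `oif_only` switch are hypothetical and stand in for whatever the sysctl controls.

```python
from typing import Dict, List


def candidate_sources(addrs_by_if: Dict[str, List[str]], oif: str,
                      oif_only: bool) -> List[str]:
    """Build the candidate source address set for a destination.

    oif_only -- hypothetical switch: when True, only addresses assigned to
                the outgoing interface are candidates (the RFC 6724
                recommendation); when False, addresses on every interface are.
    """
    if oif_only:
        return list(addrs_by_if.get(oif, []))
    return [addr for addrs in addrs_by_if.values() for addr in addrs]


addrs = {
    "eth0": ["2001:db8:1::10"],
    "eth1": ["2001:db8:2::10", "2001:db8:2::11"],
}
print(candidate_sources(addrs, "eth0", oif_only=True))
print(candidate_sources(addrs, "eth0", oif_only=False))
```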
-
- 22 Jul, 2015 2 commits
-
-
Committed by Roopa Prabhu
This is similar to the ipv4 redirect of dst output to the lwtunnel output function for encapsulation and xmit. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Roopa Prabhu
This patch adds support in ipv6 fib functions to parse Netlink RTA encap attributes and attach encap state data to rt6_info. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 21 Jul, 2015 2 commits
-
-
Committed by Herbert Xu
This patch reverts 19424e05 ("sit: Add gro callbacks to sit_offload") because it generates packets that cannot be handled even by our own GSO. Reported-by: Wolfgang Walter <linux@stwm.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Phil Sutter
Newly created flows don't have flowi6_oif set (at least if the associated socket is not interface-bound). This leads to a mismatch in __xfrm6_selector_match() for policies which specify an interface in the selector (sel->ifindex != 0). Backtracing shows this happens in code paths originating from e.g. ip6_datagram_connect(), rawv6_sendmsg() or tcp_v6_connect(). (UDP was not tested for.) In summary, this patch fixes policy matching on the outgoing interface for locally generated packets. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
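The mismatch can be illustrated with the interface part of a selector match in isolation. This is a simplified sketch, assuming a selector ifindex of 0 means "any interface"; the function name is hypothetical and the real match checks many more fields.

```python
def selector_matches_oif(sel_ifindex: int, flow_oif: int) -> bool:
    """Interface part of a policy selector match (simplified).

    A selector ifindex of 0 is assumed to act as a wildcard; otherwise the
    flow's outgoing interface must equal it.
    """
    return sel_ifindex == 0 or sel_ifindex == flow_oif


# Policy bound to interface 3; flow created without an outgoing interface.
print(selector_matches_oif(3, 0))  # unset oif never matches the policy
# Same policy once the flow carries the outgoing interface.
print(selector_matches_oif(3, 3))
# A policy without an interface matches either way.
print(selector_matches_oif(0, 0))
```

The first case is the situation the message describes: a locally generated packet whose flow has no outgoing interface recorded cannot match an interface-specific policy.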
-
- 16 Jul, 2015 7 commits
-
-
Committed by YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by YOSHIFUJI Hideaki/吉藤英明
Commit 9131f3de ("ipv6: Do not iterate over all interfaces when finding source address on specific interface.") did not properly update the best source address available. In addition, it introduced a possible NULL pointer dereference. The bug was reported by Erik Kline <ek@google.com>. Based on a patch proposed by Hajime Tazaki <thehajime@gmail.com>. Fixes: 9131f3de ("ipv6: Do not iterate over all interfaces when finding source address on specific interface.") Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Acked-by: Hajime Tazaki <thehajime@gmail.com> Acked-by: Erik Kline <ek@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
ip6_datagram_connect() is doing a lot of socket changes without the socket being locked. This looks wrong, at least for udp_lib_rehash() which could corrupt lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Florian Westphal
sparse complains:
ip_tables.c:361:27: warning: incorrect type in assignment (different modifiers)
ip_tables.c:361:27: expected struct ipt_entry *[assigned] e
ip_tables.c:361:27: got struct ipt_entry [pure] *
This doesn't change generated code. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Committed by Florian Westphal
Don't bother testing if we need to switch to the alternate stack unless the TEE target is used. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Committed by Florian Westphal
In most cases there is no reentrancy into ip/ip6tables. For skbs sent by REJECT or SYNPROXY targets, there is one level of reentrancy, but it's not relevant as those targets issue an absolute verdict, i.e. the jumpstack can be clobbered since it's not used after the target issues an absolute verdict (ACCEPT, DROP, STOLEN, etc). So the only special case where it is relevant is the TEE target, which returns XT_CONTINUE. This patch changes ip(6)_do_table to always use the jump stack starting from 0. When we detect we're operating on an skb sent via TEE (percpu nf_skb_duplicated is 1) we switch to an alternate stack to leave the original one alone. Since there is no TEE support for arptables, it doesn't need to test if tee is active. The jump stack overflow tests are no longer needed as well -- since ->stacksize is the largest call depth we cannot exceed it. A much better alternative to the external jumpstack would be to just declare a jumps[32] stack on the local stack frame, but that would mean we'd have to reject iptables rulesets that used to work before. Another alternative would be to start rejecting rulesets with a larger call depth, e.g. 1000 -- in this case it would be feasible to allocate the entire stack in the percpu area which would avoid one dereference. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Committed by Florian Westphal
The {arp,ip,ip6tables} jump stack is currently sized based on the number of user chains. However, it's rather unlikely that every user-defined chain jumps to the next, so let's use the existing loop detection logic to also track the chain depths. The stacksize is then set to the largest chain depth seen. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
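The difference between the two sizing rules can be sketched on a toy ruleset. The graph representation, the function name and the definition of depth (number of user chains on the longest jump path) are assumptions for illustration; the sketch also assumes the jump graph has no loops, which the message says is checked separately.

```python
from typing import Dict, List


def max_chain_depth(jumps: Dict[str, List[str]], entry_chains: List[str]) -> int:
    """Length of the longest jump path through user chains.

    jumps        -- hypothetical map: user chain -> user chains it jumps to
    entry_chains -- user chains reachable directly from built-in chains
    Assumes the jump graph is acyclic.
    """
    memo: Dict[str, int] = {}

    def depth(chain: str) -> int:
        if chain not in memo:
            memo[chain] = 1 + max(
                (depth(nxt) for nxt in jumps.get(chain, [])), default=0
            )
        return memo[chain]

    return max((depth(chain) for chain in entry_chains), default=0)


# Five user chains, but only a -> b -> c forms a nested call path.
jumps = {"a": ["b"], "b": ["c"], "c": [], "d": [], "e": []}
print(len(jumps))                               # sizing by number of chains
print(max_chain_depth(jumps, ["a", "d", "e"]))  # sizing by deepest path
```

For rulesets with many flat chains, sizing by the deepest path gives a much smaller stack than sizing by the chain count.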
-
- 14 Jul, 2015 1 commit
-
-
Committed by Tom Herbert
This patch makes the default to build IPv6 into the kernel. IPv6 now has significant traction and any remaining vestiges of IPv6 not being provided parity with IPv4 should be swept away. IPv6 is now core to the Internet and kernel.
Points on IPv6 adoption:
- Per Google statistics, IPv6 usage has reached 7% on the Internet and continues to exhibit an exponential growth rate https://www.google.com/intl/en/ipv6/statistics.html
- Just a few days ago ARIN officially depleted its IPv4 pool
- IPv6 only data centers are being successfully built (e.g. at Facebook)
This patch changes the IPv6 Kconfig for IPV6. Default for CONFIG_IPV6 is set to "y" and the text has been updated to reflect the maturity of IPv6.
Impact: Under some circumstances building modules into the kernel might have a performance advantage. In my testing, I did notice a very slight improvement. This will obviously increase the size of the kernel image. In my configuration I see:
IPv6 as module:
    text     data     bss     dec       hex     filename
    9703666  1899288  933888  12536842  bf4c0a  vmlinux
IPv6 built into kernel:
    text     data     bss     dec       hex     filename
    9436490  1879600  913408  12229498  ba9b7a  vmlinux
Which increases text size by ~270K (2.8% increase in size for me). If image size is an issue, presumably for a device which does not do IP networking (IMO we should be discouraging IPv4-only devices), IPV6 can be disabled or still built as a module. Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
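The "~270K" and "2.8%" figures can be checked directly from the two quoted `text` sizes. The snippet only does arithmetic on the numbers given in the message; the variable names and the choice of the smaller text size as the baseline for the percentage are assumptions.

```python
# (text, data, bss, dec) for the two vmlinux builds quoted in the message.
build_a = (9703666, 1899288, 933888, 12536842)
build_b = (9436490, 1879600, 913408, 12229498)

# Sanity check: the "dec" column is the sum of text + data + bss.
for text, data, bss, dec in (build_a, build_b):
    assert text + data + bss == dec

# Difference between the two text sizes, in bytes.
text_diff = abs(build_a[0] - build_b[0])

# Relative difference, using the smaller text size as the baseline.
text_pct = 100 * text_diff / min(build_a[0], build_b[0])

print(text_diff)           # 267176 bytes, i.e. roughly 270K
print(round(text_pct, 1))  # 2.8
```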
-
- 11 Jul, 2015 1 commit
-
-
Committed by YOSHIFUJI Hideaki/吉藤英明
If the outgoing interface is specified and the candidate address is restricted to the outgoing interface, it is enough to iterate over that given interface only. Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Acked-by: Erik Kline <ek@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 Jul, 2015 4 commits
-
-
Committed by Tom Herbert
Add support to allow non-local binds similar to how this was done for IPv4. Non-local binds are very useful in emulating the Internet in a box, etc. This adds the ip_nonlocal_bind sysctl under ipv6.
Testing: Set up nonlocal binding and receive routing on a host, e.g.:
ip -6 rule add from ::/0 iif eth0 lookup 200
ip -6 route add local 2001:0:0:1::/64 dev lo proto kernel scope host table 200
sysctl -w net.ipv6.ip_nonlocal_bind=1
Set up routing to 2001:0:0:1::/64 on peer to go to first host
ping6 -I 2001:0:0:1::1 peer-address -- to verify
Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
inet_twsk_deschedule() calls are followed by inet_twsk_put(). The only particular case is in inet_twsk_purge(), but there is no point in deferring the inet_twsk_put() until after re-enabling BH. Let's rename inet_twsk_deschedule() to inet_twsk_deschedule_put() and move the inet_twsk_put() inside. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
timewait sockets have a complex refcounting logic. Once we realize it should be similar to established and syn_recv sockets, we can use sk_nulls_del_node_init_rcu() and remove inet_twsk_unhash(). In particular, the deferred inet_twsk_put() added in commit 13475a30 ("tcp: connect() race with timewait reuse") looks unnecessary: when removing a timewait socket from ehash or bhash, the caller must own a reference on the socket anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Florian Westphal
Hop was always either 0 or sizeof(struct ipv6hdr). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Jul, 2015 2 commits
-
-
Committed by Angga
Before commit daad1512 ("ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input()."), MLD packets were only processed locally. After the change, a copy of each MLD packet goes through ip6_mr_input, causing an MRT6MSG_NOCACHE message to be generated to user space. Make MLD packets be processed only locally. Fixes: daad1512 ("ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input().") Signed-off-by: Hermin Anggawijaya <hermin.anggawijaya@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Markus Elfring
The free_percpu() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-