- 12 7月, 2014 3 次提交
-
-
由 Himangi Saraogi 提交于
The semantic patch that makes this transformation is as follows: // <smpl> @@ expression e; @@ -if (e) BUG(); +BUG_ON(e); // </smpl> Signed-off-by: NHimangi Saraogi <himangi774@gmail.com> Acked-by: NJulia Lawall <julia.lawall@lip6.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Himangi Saraogi 提交于
The semantic patch that makes the transformation is as follows: // <smpl> @@ expression e; @@ -if (e) BUG(); +BUG_ON(e); // </smpl> Signed-off-by: NHimangi Saraogi <himangi774@gmail.com> Acked-by: NJulia Lawall <julia.lawall@lip6.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
This patch introduces a possibility for userspace to set various (so far two) modes of generating addresses. This is useful for example for NetworkManager because it can set the mode to NONE and take care of link local addresses itself. That allow it to have the interface up, monitoring carrier but still don't have any addresses on it. One more use-case by Dan Williams: <quote> WWAN devices often have their LL address provided by the firmware of the device, which sometimes refuses to respond to incorrect LL addresses when doing DHCPv6 or IPv6 ND. The kernel cannot generate the correct LL address for two reasons: 1) WWAN pseudo-ethernet interfaces often construct a fake MAC address, or read a meaningless MAC address from the firmware. Thus the EUI64 and the IPv6LL address the kernel assigns will be wrong. The real LL address is often retrieved from the firmware with AT or proprietary commands. 2) WWAN PPP interfaces receive their LL address from IPV6CP, not from kernel assignments. Only after IPV6CP has completed do we know the LL address of the PPP interface and its peer. But the kernel has already assigned an incorrect LL address to the interface. So being able to suppress the kernel LL address generation and assign the one retrieved from the firmware is less complicated and more robust. </quote> Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 7月, 2014 1 次提交
-
-
由 Li RongQing 提交于
d9333196(ipv6: Allow accepting RA from local IP addresses.) made the wrong check, whether or not to accept RA with source-addr found on local machine, when accept_ra_from_local is 0. Fixes: d9333196(ipv6: Allow accepting RA from local IP addresses.) Cc: Ben Greear <greearb@candelatech.com> Cc: Hannes Frederic Sowa <hannes@redhat.com> Signed-off-by: NLi RongQing <roy.qing.li@gmail.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 7月, 2014 3 次提交
-
-
由 Tom Herbert 提交于
Automatically generate flow labels for IPv6 packets on transmit. The flow label is computed based on skb_get_hash. The flow label will only automatically be set when it is zero otherwise (i.e. flow label manager hasn't set one). This supports the transmit side functionality of RFC 6438. Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this functionality per socket. By default, auto flowlabels are disabled to avoid possible conflicts with flow label manager, however if this feature proves useful we may want to enable it by default. It should also be noted that FreeBSD has already implemented automatic flow labels (including the sysctl and socket option). In FreeBSD, automatic flow labels default to enabled. Performance impact: Running super_netperf with 200 flows for TCP_RR and UDP_RR for IPv6. Note that in UDP case, __skb_get_hash will be called for every packet with explains slight regression. In the TCP case the hash is saved in the socket so there is no regression. Automatic flow labels disabled: TCP_RR: 86.53% CPU utilization 127/195/322 90/95/99% latencies 1.40498e+06 tps UDP_RR: 90.70% CPU utilization 118/168/243 90/95/99% latencies 1.50309e+06 tps Automatic flow labels enabled: TCP_RR: 85.90% CPU utilization 128/199/337 90/95/99% latencies 1.40051e+06 UDP_RR 92.61% CPU utilization 115/164/236 90/95/99% latencies 1.4687e+06 Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
For a connected socket we can precompute the flow hash for setting in skb->hash on output. This is a performance advantage over calculating the skb->hash for every packet on the connection. The computation is done using the common hash algorithm to be consistent with computations done for packets of the connection in other states where thers is no socket (e.g. time-wait, syn-recv, syn-cookies). This patch adds sk_txhash to the sock structure. inet_set_txhash and ip6_set_txhash functions are added which are called from points in TCP and UDP where socket moves to established state. skb_set_hash_from_sk is a function which sets skb->hash from the sock txhash value. This is called in UDP and TCP transmit path when transmitting within the context of a socket. Tested: ran super_netperf with 200 TCP_RR streams over a vxlan interface (in this case skb_get_hash called on every TX packet to create a UDP source port). Before fix: 95.02% CPU utilization 154/256/505 90/95/99% latencies 1.13042e+06 tps Time in functions: 0.28% skb_flow_dissect 0.21% __skb_get_hash After fix: 94.95% CPU utilization 156/254/485 90/95/99% latencies 1.15447e+06 Neither __skb_get_hash nor skb_flow_dissect appear in perf Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neal Cardwell 提交于
Always store in snt_synack the time at which the server received the first client SYN and attempted to send the first SYNACK. Recent commit aa27fc50 ("tcp: tcp_v[46]_conn_request: fix snt_synack initialization") resolved an inconsistency between IPv4 and IPv6 in the initialization of snt_synack. This commit brings back the idea from 843f4a55 (tcp: use tcp_v4_send_synack on first SYN-ACK), which was going for the original behavior of snt_synack from the commit where it was added in 9ad7c049 ("tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side") in v3.1. In addition to being simpler (and probably a tiny bit faster), unconditionally storing the time of the first SYNACK attempt has been useful because it allows calculating a performance metric quantifying how long it took to establish a passive TCP connection. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Cc: Octavian Purdila <octavian.purdila@intel.com> Cc: Jerry Chu <hkchu@google.com> Acked-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 7月, 2014 3 次提交
-
-
由 Eric Dumazet 提交于
When an UDP application switches from AF_INET to AF_INET6 sockets, we have a small performance degradation for IPv4 communications because of extra cache line misses to access ipv6only information. This can also be noticed for TCP listeners, as ipv6_only_sock() is also used from __inet_lookup_listener()->compute_score() This is magnified when SO_REUSEPORT is used. Move ipv6only into struct sock_common so that it is available at no extra cost in lookups. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Greear 提交于
This can be used in virtual networking applications, and may have other uses as well. The option is disabled by default. A specific use case is setting up virtual routers, bridges, and hosts on a single OS without the use of network namespaces or virtual machines. With proper use of ip rules, routing tables, veth interface pairs and/or other virtual interfaces, and applications that can bind to interfaces and/or IP addresses, it is possibly to create one or more virtual routers with multiple hosts attached. The host interfaces can act as IPv6 systems, with radvd running on the ports in the virtual routers. With the option provided in this patch enabled, those hosts can now properly obtain IPv6 addresses from the radvd. Signed-off-by: NBen Greear <greearb@candelatech.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Greear 提交于
This is disabled by default, just like similar debug info already in this module. But, makes it easier to find out why RA is not being accepted when debugging strange behaviour. Signed-off-by: NBen Greear <greearb@candelatech.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 6月, 2014 12 次提交
-
-
由 Octavian Purdila 提交于
Create tcp_conn_request and remove most of the code from tcp_v4_conn_request and tcp_v6_conn_request. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Add queue_add_hash member to tcp_request_sock_ops so that we can later unify tcp_v4_conn_request and tcp_v6_conn_request. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Add mss_clamp member to tcp_request_sock_ops so that we can later unify tcp_v4_conn_request and tcp_v6_conn_request. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Create a new tcp_request_sock_ops method to unify the IPv4/IPv6 signature for tcp_v[46]_send_synack. This allows us to later unify tcp_v4_rtx_synack with tcp_v6_rtx_synack and tcp_v4_conn_request with tcp_v4_conn_request. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
More work in preparation of unifying tcp_v4_conn_request and tcp_v6_conn_request: indirect the init sequence calls via the tcp_request_sock_ops. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Make the tcp_v6_conn_request calls flow similar with that of tcp_v4_conn_request. Note that want_cookie can be true only if isn is zero and that is why we can move the if (want_cookie) block out of the if (!isn) block. Moving security_inet_conn_request() has a couple of side effects: missing inet_rsk(req)->ecn_ok update and the req->cookie_ts update. However, neither SELinux nor Smack security hooks seems to check them. This change should also avoid future different behaviour for IPv4 and IPv6 in the security hooks. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Acked-by: NPaul Moore <paul@paul-moore.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Create wrappers with same signature for the IPv4/IPv6 request routing calls and use these wrappers (via route_req method from tcp_request_sock_ops) in tcp_v4_conn_request and tcp_v6_conn_request with the purpose of unifying the two functions in a later patch. We can later drop the wrapper functions and modify inet_csk_route_req and inet6_cks_route_req to use the same signature. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Move the specific IPv4/IPv6 cookie sequence initialization to a new method in tcp_request_sock_ops in preparation for unifying tcp_v4_conn_request and tcp_v6_conn_request. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Move the specific IPv4/IPv6 intializations to a new method in tcp_request_sock_ops in preparation for unifying tcp_v4_conn_request and tcp_v6_conn_request. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Since pktops is only used for IPv6 only and opts is used for IPv4 only, we can move these fields into a union and this allows us to drop the inet6_reqsk_alloc function as after this change it becomes equivalent with inet_reqsk_alloc. This patch also fixes a kmemcheck issue in the IPv6 stack: the flags field was not annotated after a request_sock was allocated. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
Commit 016818d0 (tcp: TCP Fast Open Server - take SYNACK RTT after completing 3WHS) changes the code to only take a snt_synack timestamp when a SYNACK transmit or retransmit succeeds. This behaviour is later broken by commit 843f4a55 (tcp: use tcp_v4_send_synack on first SYN-ACK), as snt_synack is now updated even if tcp_v4_send_synack fails. Also, commit 3a19ce0e (tcp: IPv6 support for fastopen server) misses the required IPv6 updates for 016818d0. This patch makes sure that snt_synack is updated only when the SYNACK trasnmit/retransmit succeeds, for both IPv4 and IPv6. Cc: Cardwell <ncardwell@google.com> Cc: Daniel Lee <longinus00@gmail.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 6月, 2014 1 次提交
-
-
由 Octavian Purdila 提交于
ir_mark initialization is done for both TCP v4 and v6, move it in the common tcp_openreq_init function. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 6月, 2014 2 次提交
-
-
由 huizhang 提交于
Bug report on https://bugzilla.kernel.org/show_bug.cgi?id=75781 When a local output ipsec packet match the mangle table rule, and be set mark value, the packet will be route again in route_me_harder -> _session_decoder6 In this case, the nhoff in CB of skb was still the default value 0. So the protocal match can't success and the packet can't match correct SA rule,and then the packet be send out in plaintext. To fixed up the issue. The CB->nhoff must be set. Signed-off-by: NHui Zhang <huizhang@marvell.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dmitry Popov 提交于
ipv4_{update_pmtu,redirect} were called with tunnel's ifindex (t->dev is a tunnel netdevice). It caused wrong route lookup and failure of pmtu update or redirect. We should use the same ifindex that we use in ip_route_output_* in *tunnel_xmit code. It is t->parms.link . Signed-off-by: NDmitry Popov <ixaphire@qrator.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 6月, 2014 1 次提交
-
-
由 Sven Wegener 提交于
To avoid the confusion of having two variables, shrink the function to only use the parameter variable for looping. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 6月, 2014 4 次提交
-
-
由 Tom Herbert 提交于
Call gso_make_checksum. This should have the benefit of using a checksum that may have been previously computed for the packet. This also adds NETIF_F_GSO_GRE_CSUM to differentiate devices that offload GRE GSO with and without the GRE checksum offloaed. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Added a new netif feature for GSO_UDP_TUNNEL_CSUM. This indicates that a device is capable of computing the UDP checksum in the encapsulating header of a UDP tunnel. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Added udp_set_csum and udp6_set_csum functions to set UDP checksums in packets. These are for simple UDP packets such as those that might be created in UDP tunnels. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sven Wegener 提交于
Commit efe4208f ("ipv6: make lookups simpler and faster") introduced a regression in udp_v6_mcast_next(), resulting in multicast packets not reaching the destination sockets under certain conditions. The packet's IPv6 addresses are wrongly compared to the IPv6 addresses from the function's socket argument, which indicates the starting point for looping, instead of the loop variable. If the addresses from the first socket do not match the packet's addresses, no socket in the list will match. Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2014 2 次提交
-
-
由 Eric Dumazet 提交于
I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery is disabled. Note how GSO/TSO packets do not have monotonically incrementing ID. 06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396) 06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212) 06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972) 06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292) 06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764) It appears I introduced this bug in linux-3.1. inet_getid() must return the old value of peer->ip_id_count, not the new one. Lets revert this part, and remove the prevention of a null identification field in IPv6 Fragment Extension Header, which is dubious and not even done properly. Fixes: 87c48fa3 ("ipv6: make fragment identifications less predictable") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Ideally, we would need to generate IP ID using a per destination IP generator. linux kernels used inet_peer cache for this purpose, but this had a huge cost on servers disabling MTU discovery. 1) each inet_peer struct consumes 192 bytes 2) inetpeer cache uses a binary tree of inet_peer structs, with a nominal size of ~66000 elements under load. 3) lookups in this tree are hitting a lot of cache lines, as tree depth is about 20. 4) If server deals with many tcp flows, we have a high probability of not finding the inet_peer, allocating a fresh one, inserting it in the tree with same initial ip_id_count, (cf secure_ip_id()) 5) We garbage collect inet_peer aggressively. IP ID generation do not have to be 'perfect' Goal is trying to avoid duplicates in a short period of time, so that reassembly units have a chance to complete reassembly of fragments belonging to one message before receiving other fragments with a recycled ID. We simply use an array of generators, and a Jenkin hash using the dst IP as a key. ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it belongs (it is only used from this file) secure_ip_id() and secure_ipv6_id() no longer are needed. Rename ip_select_ident_more() to ip_select_ident_segs() to avoid unnecessary decrement/increment of the number of segments. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 5月, 2014 3 次提交
-
-
由 Tom Herbert 提交于
RFC 6935 permits zero checksums to be used in IPv6 however this is recommended only for certain tunnel protocols, it does not make checksums completely optional like they are in IPv4. This patch restricts the use of IPv6 zero checksums that was previously intoduced. no_check6_tx and no_check6_rx have been added to control the use of checksums in UDP6 RX and TX path. The normal sk_no_check_{rx,tx} settings are not used (this avoids ambiguity when dealing with a dual stack socket). A helper function has been added (udp_set_no_check6) which can be called by tunnel impelmentations to all zero checksums (send on the socket, and accept them as valid). Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Define separate fields in the sock structure for configuring disabling checksums in both TX and RX-- sk_no_check_tx and sk_no_check_rx. The SO_NO_CHECK socket option only affects sk_no_check_tx. Also, removed UDP_CSUM_* defines since they are no longer necessary. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
It doesn't seem like an protocols are setting anything other than the default, and allowing to arbitrarily disable checksums for a whole protocol seems dangerous. This can be done on a per socket basis. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 5月, 2014 3 次提交
-
-
由 Eric Dumazet 提交于
When GRE support was added in linux-3.14, CHECKSUM_COMPLETE handling broke on GRE+IPv6 because we did not update/use the appropriate csum : GRO layer is supposed to use/update NAPI_GRO_CB(skb)->csum instead of skb->csum Tested using a GRE tunnel and IPv6 traffic. GRO aggregation now happens at the first level (ethernet device) instead of being done in gre tunnel. Native IPv6+TCP is still properly aggregated. Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Li RongQing 提交于
entries is always greater than rt_max_size here, since if entries is less than rt_max_size, the fib6_run_gc function will be skipped Signed-off-by: NLi RongQing <roy.qing.li@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Gundersen 提交于
Enable the module alias hookup to allow tunnel modules to be autoloaded on demand. This is in line with how most other netdev kinds work, and will allow userspace to create tunnels without having CAP_SYS_MODULE. Signed-off-by: NTom Gundersen <teg@jklm.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 5月, 2014 1 次提交
-
-
由 Lorenzo Colitti 提交于
Currently, "ip -6 route get mark xyz" ignores the mark passed in by userspace. Make it honour the mark, just like IPv4 does. Signed-off-by: NLorenzo Colitti <lorenzo@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 5月, 2014 1 次提交
-
-
由 Duan Jiong 提交于
RFC 4861 states in 7.2.5: The IsRouter flag in the cache entry MUST be set based on the Router flag in the received advertisement. In those cases where the IsRouter flag changes from TRUE to FALSE as a result of this update, the node MUST remove that router from the Default Router List and update the Destination Cache entries for all destinations using that neighbor as a router as specified in Section 7.3.3. This is needed to detect when a node that is used as a router stops forwarding packets due to being configured as a host. Currently, when dealing with NA Message which IsRouter flag changes from TRUE to FALSE, the kernel only removes router from the Default Router List, and don't update the Destination Cache entries. Now in order to update those Destination Cache entries, i introduce function rt6_clean_tohost(). Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-