- 07 6月, 2015 7 次提交
-
-
由 Antonio Quartulli 提交于
This API has to be used to let any routing protocol free neighbor specific allocated resources Signed-off-by: NAntonio Quartulli <antonio@open-mesh.com> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Antonio Quartulli 提交于
Some mesh attributes are behind substructs in the batadv_priv object and for this reason the name cannot be used anymore to refer to them. This patch allows to specify the variable name where the attribute is stored inside batadv_priv instead of using the name Signed-off-by: NAntonio Quartulli <antonio@open-mesh.com> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Sven Eckelmann 提交于
An unoptimized version of the Jenkins one-at-a-time hash function is used and partially copied all over the code wherever an hashtable is used. Instead the optimized version shared between the whole kernel should be used to reduce code duplication and use better optimized code. Only the DAT code must use the old implementation because it is used as distributed hash function which has to be common for all nodes. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Alexei Starovoitov 提交于
allow programs read/write skb->mark, tc_index fields and ((struct qdisc_skb_cb *)cb)->data. mark and tc_index are generically useful in TC. cb[0]-cb[4] are primarily used to pass arguments from one program to another called via bpf_tail_call() which can be seen in sockex3_kern.c example. All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs. mark, tc_index are writeable from tc_cls_act only. cb[0]-cb[4] are writeable by both sockets and tc_cls_act. Add verifier tests and improve sample code. Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexei Starovoitov 提交于
eBPF programs attached to ingress and egress qdiscs see inconsistent skb->data. For ingress L2 header is already pulled, whereas for egress it's present. This is known to program writers which are currently forced to use BPF_LL_OFF workaround. Since programs don't change skb internal pointers it is safe to do pull/push right around invocation of the program and earlier taps and later pt->func() will not be affected. Multiple taps via packet_rcv(), tpacket_rcv() are doing the same trick around run_filter/BPF_PROG_RUN even if skb_shared. This fix finally allows programs to use optimized LD_ABS/IND instructions without BPF_LL_OFF for higher performance. tc ingress + cls_bpf + samples/bpf/tcbpf1_kern.o w/o JIT w/JIT before 20.5 23.6 Mpps after 21.8 26.6 Mpps Old programs with BPF_LL_OFF will still work as-is. We can now undo most of the earlier workaround commit: a166151c ("bpf: fix bpf helpers to use skb->mac_header relative offsets") Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
For same reasons than in commit 12e25e10 ("tcp: remove redundant checks"), we can remove redundant checks done for timewait sockets. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
When an application needs to force a source IP on an active TCP socket it has to use bind(IP, port=x). As most applications do not want to deal with already used ports, x is often set to 0, meaning the kernel is in charge to find an available port. But kernel does not know yet if this socket is going to be a listener or be connected. It has very limited choices (no full knowledge of final 4-tuple for a connect()) With limited ephemeral port range (about 32K ports), it is very easy to fill the space. This patch adds a new SOL_IP socket option, asking kernel to ignore the 0 port provided by application in bind(IP, port=0) and only remember the given IP address. The port will be automatically chosen at connect() time, in a way that allows sharing a source port as long as the 4-tuples are unique. This new feature is available for both IPv4 and IPv6 (Thanks Neal) Tested: Wrote a test program and checked its behavior on IPv4 and IPv6. strace(1) shows sequences of bind(IP=127.0.0.2, port=0) followed by connect(). Also getsockname() show that the port is still 0 right after bind() but properly allocated after connect(). socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5 setsockopt(5, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0 bind(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, 16) = 0 getsockname(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0 connect(5, {sa_family=AF_INET, sin_port=htons(53174), sin_addr=inet_addr("127.0.0.3")}, 16) = 0 getsockname(5, {sa_family=AF_INET, sin_port=htons(38050), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0 IPv6 test : socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 7 setsockopt(7, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0 bind(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0 getsockname(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0 connect(7, {sa_family=AF_INET6, sin6_port=htons(57300), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0 getsockname(7, {sa_family=AF_INET6, sin6_port=htons(60964), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0 I was able to bind()/connect() a million concurrent IPv4 sockets, instead of ~32000 before patch. lpaa23:~# ulimit -n 1000010 lpaa23:~# ./bind --connect --num-flows=1000000 & 1000000 sockets lpaa23:~# grep TCP /proc/net/sockstat TCP: inuse 2000063 orphan 0 tw 47 alloc 2000157 mem 66 Check that a given source port is indeed used by many different connections : lpaa23:~# ss -t src :40000 | head -10 State Recv-Q Send-Q Local Address:Port Peer Address:Port ESTAB 0 0 127.0.0.2:40000 127.0.202.33:44983 ESTAB 0 0 127.0.0.2:40000 127.2.27.240:44983 ESTAB 0 0 127.0.0.2:40000 127.2.98.5:44983 ESTAB 0 0 127.0.0.2:40000 127.0.124.196:44983 ESTAB 0 0 127.0.0.2:40000 127.2.139.38:44983 ESTAB 0 0 127.0.0.2:40000 127.1.59.80:44983 ESTAB 0 0 127.0.0.2:40000 127.3.6.228:44983 ESTAB 0 0 127.0.0.2:40000 127.0.38.53:44983 ESTAB 0 0 127.0.0.2:40000 127.1.197.10:44983 Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 6月, 2015 11 次提交
-
-
由 Tom Herbert 提交于
In flow dissector if an MPLS header contains an entropy label this is saved in the new keyid field of flow_keys. The entropy label is then represented in the flow hash function input. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
In flow dissector if a GRE header contains a keyid this is saved in the new keyid field of flow_keys. The GRE keyid is then represented in the flow hash function input. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
In flow_dissector set the flow label in flow_keys for IPv6. This also removes the shortcircuiting of flow dissection when a non-zero label is present, the flow label can be considered to provide additional entropy for a hash. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
In flow_dissector set vlan_id in flow_keys when VLAN is found. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
We don't need to return the IPv6 address hash as part of flow keys. In general, using the IPv6 address hash is risky in a hash value since the underlying use of xor provides no entropy. If someone really needs the hash value they can get it from the full IPv6 addresses in flow keys (e.g. from flow_get_u32_src). Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Add a new flow key for TIPC addresses. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This patch adds full IPv6 addresses into flow_keys and uses them as input to the flow hash function. The implementation supports either IPv4 or IPv6 addresses in a union, and selector is used to determine how may words to input to jhash2. We also add flow_get_u32_dst and flow_get_u32_src functions which are used to get a u32 representation of the source and destination addresses. For IPv6, ipv6_addr_hash is called. These functions retain getting the legacy values of src and dst in flow_keys. With this patch, Ethertype and IP protocol are now included in the flow hash input. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This patch changes flow hashing to use jhash2 over the flow_keys structure instead just doing jhash_3words over src, dst, and ports. This method will allow us take more input into the hashing function so that we can include full IPv6 addresses, VLAN, flow labels etc. without needing to resort to xor'ing which makes for a poor hash. Acked-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
key_basic is set twice in __skb_flow_dissect which seems unnecessary. Remove second one. Acked-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Do break when we see routing flag or a non-zero version number in GRE header. Acked-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexei Starovoitov 提交于
fix build error: net/core/filter.c: In function 'bpf_clone_redirect': net/core/filter.c:1429:18: error: 'struct sk_buff' has no member named 'tc_verd' if (G_TC_AT(skb2->tc_verd) & AT_INGRESS) Fixes: 3896d655 ("bpf: introduce bpf_clone_redirect() helper") Reported-by: NOr Gerlitz <gerlitz.or@gmail.com> Reported-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 6月, 2015 4 次提交
-
-
由 Wei Liu 提交于
Xen virtual network driver has higher latency than a physical NIC. Having only 128K as limit for TSQ introduced 30% regression in guest throughput. This patch raises the limit to 256K. This reduces the regression to 8%. This buys us more time to work out a proper solution in the long run. Signed-off-by: NWei Liu <wei.liu2@citrix.com> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
tcp_v4_rcv() checks the following before calling tcp_v4_do_rcv(): if (th->doff < sizeof(struct tcphdr) / 4) goto bad_packet; if (!pskb_may_pull(skb, th->doff * 4)) goto discard_it; So following check in tcp_v4_do_rcv() is redundant and "goto csum_err;" is wrong anyway. if (skb->len < tcp_hdrlen(skb) || ...) goto csum_err; A second check can be removed after no_tcp_socket label for same reason. Same tests can be removed in tcp_v6_do_rcv() Note : short tcp frames are not properly accounted in tcpInErrs MIB, because pskb_may_pull() failure simply drops incoming skb, we might fix this in a separate patch. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Scott Feldman 提交于
Clarify in documentation and code that IPV4 FIB add operation is used for both adding a new FIB entry to the device and for modifying an existing FIB entry on the device. Also, remove left-over references to ipv4_fib ops and replace with details on SWITCHDEV_PORT_IPV4_FIB object. Signed-off-by: NScott Feldman <sfeldma@gmail.com> Acked-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexei Starovoitov 提交于
Allow eBPF programs attached to classifier/actions to call bpf_clone_redirect(skb, ifindex, flags) helper which will mirror or redirect the packet by dynamic ifindex selection from within the program to a target device either at ingress or at egress. Can be used for various scenarios, for example, to load balance skbs into veths, split parts of the traffic to local taps, etc. Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2015 12 次提交
-
-
由 Markus Pargmann 提交于
Remove ret variable and all jumps. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
We can avoid this indirect return variable by directly returning the error values. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
Declare the returntype of batadv_compare_eth as bool. The function called inside this helper function (ether_addr_equal_unaligned) also uses bool as return value, so there is no need to return int. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
It is much clearer to see a bool type as return value than 'int' for functions that are supposed to return true or false. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Sven Eckelmann 提交于
orig_ifinfo is dereferenced multiple times in batadv_iv_ogm_update_seqnos before the check for NULL is done. The function also exists at the beginning when orig_ifinfo would have been NULL. This makes the check at the end unnecessary and only confuses the reader/code analyzers. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
batadv_orig_bat_iv->bcast_own is actually not a bitfield, it is an array. Adjust the comment accordingly. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
This is a small copy paste fix for batadv_ing_buffer_avg. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
The kernel coding style says, that there should not be multiple assignments in one row. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
CodingStyle describes that either none or both branches of a conditional have to have brackets. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
由 Markus Pargmann 提交于
It is just a bit easier to put the error handling at one place and let multiple error paths use the same calls. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
- 02 6月, 2015 6 次提交
-
-
由 Toshiaki Makita 提交于
Currently packets with non-hardware-accelerated vlan cannot be handled by GRO. This causes low performance for 802.1ad and stacked vlan, as their vlan tags are currently not stripped by hardware. This patch adds GRO support for non-hardware-accelerated vlan and improves receive performance of them. Test Environment: vlan device (.1Q) on vlan device (.1ad) on ixgbe (82599) Result: - Before $ netperf -t TCP_STREAM -H 192.168.20.2 -l 60 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 60.00 5233.17 Rx side CPU usage: %usr %sys %irq %soft %idle 0.27 58.03 0.00 41.70 0.00 - After $ netperf -t TCP_STREAM -H 192.168.20.2 -l 60 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 60.00 7586.85 Rx side CPU usage: %usr %sys %irq %soft %idle 0.50 25.83 0.00 59.53 14.14 [ Register VLAN offloads with priority 10 -DaveM ] Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steffen Klassert 提交于
We currently rely on the PMTU discovery of xfrm. However if a packet is localy sent, the PMTU mechanism of xfrm tries to to local socket notification what might not work for applications like ping that don't check for this. So add pmtu handling to vti6_xmit to report MTU changes immediately. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neil McKee 提交于
If new optional attribute OVS_USERSPACE_ATTR_ACTIONS is added to an OVS_ACTION_ATTR_USERSPACE action, then include the datapath actions in the upcall. This Directly associates the sampled packet with the path it takes through the virtual switch. Path information currently includes mangling, encapsulation and decapsulation actions for tunneling protocols GRE, VXLAN, Geneve, MPLS and QinQ, but this extension requires no further changes to accommodate datapath actions that may be added in the future. Adding path information enhances visibility into complex virtual networks. Signed-off-by: NNeil McKee <neil.mckee@inmon.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
When we scan a packet for GRO processing, we want to see the most common packet types in the front of the offload_base list. So add a priority field so we can handle this properly. IPv4/IPv6 get the highest priority with the implicit zero priority field. Next comes ethernet with a priority of 10, and then we have the MPLS types with a priority of 15. Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Suggested-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This reverts commit f96dee13. It isn't right, ethtool is meant to manage one PHY instance per netdevice at a time, and this is selected by the SET command. Therefore by definition the GET command must only return the settings for the configured and selected PHY. Reported-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bernhard Thaler 提交于
This partially reverts commit 1086bbe9 ("netfilter: ensure number of counters is >0 in do_replace()") in net/bridge/netfilter/ebtables.c. Setting rules with ebtables does not work any more with 1086bbe9 place. There is an error message and no rules set in the end. e.g. ~# ebtables -t nat -A POSTROUTING --src 12:34:56:78:9a:bc -j DROP Unable to update the kernel. Two possible causes: 1. Multiple ebtables programs were executing simultaneously. The ebtables userspace tool doesn't by default support multiple ebtables programs running Reverting the ebtables part of 1086bbe9 makes this work again. Signed-off-by: NBernhard Thaler <bernhard.thaler@wvnet.at> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-