- 09 9月, 2014 14 次提交
-
-
由 Arturo Borrero 提交于
Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv6; a similar patch exists to handle IPv4. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv4; a similar patch follows to handle IPv6. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Nicolas Dichtel 提交于
This is already done for x_tables (family AF_INET and AF_INET6), let's do it for AF_BRIDGE also. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
Both SNAT and DNAT (and the upcoming masquerade) can have additional configuration parameters, such as port randomization and NAT addressing persistence. We can cover these scenarios by simply adding a flag attribute for userspace to fill when needed. The flags to use are defined in include/uapi/linux/netfilter/nf_nat.h: NF_NAT_RANGE_MAP_IPS NF_NAT_RANGE_PROTO_SPECIFIED NF_NAT_RANGE_PROTO_RANDOM NF_NAT_RANGE_PERSISTENT NF_NAT_RANGE_PROTO_RANDOM_FULLY NF_NAT_RANGE_PROTO_RANDOM_ALL The caller must take care of not messing up with the flags, as they are added unconditionally to the final resulting nf_nat_range. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
This patch extend the NFT_MSG_DELTABLE call to support flushing the entire ruleset. The options now are: * No family speficied, no table specified: flush all the ruleset. * Family specified, no table specified: flush all tables in the AF. * Family specified, table specified: flush the given table. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
This patch refactor the code to schedule objects deletion. They are useful in follow-up patches. In order to be able to use these new helper functions in all the code, they are placed in the top of the file, with all the dependant functions and symbols. nft_rule_disactivate_next has been renamed to nft_rule_deactivate. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Bojan Prtvar 提交于
The skb_find_text() accepts uninitialized textsearch state variable. Signed-off-by: NBojan Prtvar <prtvar.b@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Julian Anastasov 提交于
Use union to reserve the required stack space for sockopt data which is less than the currently hardcoded value of 128. Now the tables for commands should be more readable. The checks added for readability are optimized by compiler, others warn at compile time if command uses too much stack or exceeds the storage of set_arglen and get_arglen. As Dan Carpenter points out, we can run for unprivileged user, so we can silent some error messages. Signed-off-by: NJulian Anastasov <ja@ssi.bg> CC: Dan Carpenter <dan.carpenter@oracle.com> CC: Andrey Utkin <andrey.krieger.utkin@gmail.com> CC: David Binderman <dcb314@hotmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Ana Rey 提交于
Add devgroup support to let us match device group of a packets incoming or outgoing interface. Signed-off-by: NAna Rey <anarey@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
For the sake of homogenize the function naming scheme, let's rename nf_table_delrule_by_chain() to nft_delrule_by_chain(). Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
This patch adds a helper function to unregister chain hooks in the chain deletion path. Basically, a code factorization. The new function is useful in follow-up patches. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
This helper function always schedule the rule to be removed in the following transaction. In follow-up patches, it is interesting to handle separately the logic of rule activation/disactivation from the transaction mechanism. So, this patch simply splits the original nf_tables_delrule_one() in two functions, allowing further control. While at it, for the sake of homigeneize the function naming scheme, let's rename nf_tables_delrule_one() to nft_delrule(). Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Use the exported IPv6 NAT functions that are provided by the core. This removes duplicated code so iptables and nft use the same NAT codebase. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Move the specific NAT IPv6 core functions that are called from the hooks from ip6table_nat.c to nf_nat_l3proto_ipv6.c. This prepares the ground to allow iptables and nft to use the same NAT engine code that comes in a follow up patch. This also renames nf_nat_ipv6_fn to nft_nat_ipv6_fn in net/ipv6/netfilter/nft_chain_nat_ipv6.c to avoid a compilation breakage. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 02 9月, 2014 2 次提交
-
-
由 Pablo Neira Ayuso 提交于
Use the exported IPv4 NAT functions that are provided by the core. This removes duplicated code so iptables and nft use the same NAT codebase. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Move the specific NAT IPv4 core functions that are called from the hooks from iptable_nat.c to nf_nat_l3proto_ipv4.c. This prepares the ground to allow iptables and nft to use the same NAT engine code that comes in a follow up patch. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 27 8月, 2014 1 次提交
-
-
由 Alexey Perevalov 提交于
You can use this to skip accounting objects when listing/resetting via NFNL_MSG_ACCT_GET/NFNL_MSG_ACCT_GET_CTRZERO messages with the NLM_F_DUMP netlink flag. The filtering covers the following cases: 1. No filter specified. In this case, the client will get old behaviour, 2. List/reset counter object only: In this case, you have to use NFACCT_F_QUOTA as mask and value 0. 3. List/reset quota objects only: You have to use NFACCT_F_QUOTA_PKTS as mask and value - the same, for byte based quota mask should be NFACCT_F_QUOTA_BYTES and value - the same. If you want to obtain the object with any quota type (ie. NFACCT_F_QUOTA_PKTS|NFACCT_F_QUOTA_BYTES), you need to perform two dump requests, one to obtain NFACCT_F_QUOTA_PKTS objects and another for NFACCT_F_QUOTA_BYTES. Signed-off-by: NAlexey Perevalov <a.perevalov@samsung.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 25 8月, 2014 4 次提交
-
-
由 Jozsef Kadlecsik 提交于
Dan Carpenter reported that the static checker emits the warning net/netfilter/ipset/ip_set_list_set.c:600 init_list_set() warn: integer overflows 'sizeof(*map) + size * set->dsize' Limit the maximal number of elements in list type of sets. Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
-
由 Mark Rustad 提交于
Resolve missing-field-initializer warnings by providing a directed initializer. Signed-off-by: NMark Rustad <mark.d.rustad@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
-
由 Sergey Popovich 提交于
Ranges of values are broken with hash:net,net and hash:net,port,net. hash:net,net ============ # ipset create test-nn hash:net,net # ipset add test-nn 10.0.10.1-10.0.10.127,10.0.0.0/8 # ipset list test-nn Name: test-nn Type: hash:net,net Revision: 0 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 16960 References: 0 Members: 10.0.10.1,10.0.0.0/8 # ipset test test-nn 10.0.10.65,10.0.0.1 10.0.10.65,10.0.0.1 is NOT in set test-nn. # ipset test test-nn 10.0.10.1,10.0.0.1 10.0.10.1,10.0.0.1 is in set test-nn. hash:net,port,net ================= # ipset create test-npn hash:net,port,net # ipset add test-npn 10.0.10.1-10.0.10.127,tcp:80,10.0.0.0/8 # ipset list test-npn Name: test-npn Type: hash:net,port,net Revision: 0 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 17344 References: 0 Members: 10.0.10.8/29,tcp:80,10.0.0.0 10.0.10.16/28,tcp:80,10.0.0.0 10.0.10.2/31,tcp:80,10.0.0.0 10.0.10.64/26,tcp:80,10.0.0.0 10.0.10.32/27,tcp:80,10.0.0.0 10.0.10.4/30,tcp:80,10.0.0.0 10.0.10.1,tcp:80,10.0.0.0 # ipset list test-npn # ipset test test-npn 10.0.10.126,tcp:80,10.0.0.2 10.0.10.126,tcp:80,10.0.0.2 is NOT in set test-npn. # ipset test test-npn 10.0.10.126,tcp:80,10.0.0.0 10.0.10.126,tcp:80,10.0.0.0 is in set test-npn. # ipset create test-npn hash:net,port,net # ipset add test-npn 10.0.10.0/24,tcp:80-81,10.0.0.0/8 # ipset list test-npn Name: test-npn Type: hash:net,port,net Revision: 0 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 17024 References: 0 Members: 10.0.10.0,tcp:80,10.0.0.0 10.0.10.0,tcp:81,10.0.0.0 # ipset test test-npn 10.0.10.126,tcp:80,10.0.0.0 10.0.10.126,tcp:80,10.0.0.0 is NOT in set test-npn. # ipset test test-npn 10.0.10.0,tcp:80,10.0.0.0 10.0.10.0,tcp:80,10.0.0.0 is in set test-npn. Correctly setup from..to variables where no IPSET_ATTR_IP_TO{,2} attribute is given, so in range processing loop we construct proper cidr value. Check whenever we have no ranges and can short cut in hash:net,net properly. Use unlikely() where appropriate, to comply with other modules. Signed-off-by: NSergey Popovich <popovich_sergei@mail.ru> Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
-
由 Vytas Dauksa 提交于
Markmask is an u32, hence it can't be greater then 4294967295 ( i.e. 0xffffffff ). This was causing smatch warning: net/netfilter/ipset/ip_set_hash_gen.h:1084 hash_ipmark_create() warn: impossible condition '(markmask > 4294967295) => (0-u32max > u32max)' Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
-
- 24 8月, 2014 2 次提交
-
-
由 Ana Rey 提交于
Add cpu support to meta expresion. This allows you to match packets with cpu number. Signed-off-by: NAna Rey <anarey@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Ana Rey 提交于
Add pkttype support for ip, ipv6 and inet families of tables. This allows you to fetch the meta packet type based on the link layer information. The loopback traffic is a special case, the packet type is guessed from the network layer header. No special handling for bridge and arp since we're not going to see such traffic in the loopback interface. Joint work with Alvaro Neira Ayuso <alvaroneay@gmail.com> Signed-off-by: NAlvaro Neira Ayuso <alvaroneay@gmail.com> Signed-off-by: NAna Rey <anarey@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 15 8月, 2014 6 次提交
-
-
由 Thomas Graf 提交于
Silences the following sparse warnings: net/netlink/af_netlink.c:2926:21: warning: context imbalance in 'netlink_seq_start' - wrong count at exit net/netlink/af_netlink.c:2972:13: warning: context imbalance in 'netlink_seq_stop' - unexpected unlock Signed-off-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neal Cardwell 提交于
Fix TCP FRTO logic so that it always notices when snd_una advances, indicating that any RTO after that point will be a new and distinct loss episode. Previously there was a very specific sequence that could cause FRTO to fail to notice a new loss episode had started: (1) RTO timer fires, enter FRTO and retransmit packet 1 in write queue (2) receiver ACKs packet 1 (3) FRTO sends 2 more packets (4) RTO timer fires again (should start a new loss episode) The problem was in step (3) above, where tcp_process_loss() returned early (in the spot marked "Step 2.b"), so that it never got to the logic to clear icsk_retransmits. Thus icsk_retransmits stayed non-zero. Thus in step (4) tcp_enter_loss() would see the non-zero icsk_retransmits, decide that this RTO is not a new episode, and decide not to cut ssthresh and remember the current cwnd and ssthresh for undo. There were two main consequences to the bug that we have observed. First, ssthresh was not decreased in step (4). Second, when there was a series of such FRTO (1-4) sequences that happened to be followed by an FRTO undo, we would restore the cwnd and ssthresh from before the entire series started (instead of the cwnd and ssthresh from before the most recent RTO). This could result in cwnd and ssthresh being restored to values much bigger than the proper values. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Fixes: e33099f9 ("tcp: implement RFC5682 F-RTO") Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
tcp_tw_recycle heavily relies on tcp timestamps to build a per-host ordering of incoming connections and teardowns without the need to hold state on a specific quadruple for TCP_TIMEWAIT_LEN, but only for the last measured RTO. To do so, we keep the last seen timestamp in a per-host indexed data structure and verify if the incoming timestamp in a connection request is strictly greater than the saved one during last connection teardown. Thus we can verify later on that no old data packets will be accepted by the new connection. During moving a socket to time-wait state we already verify if timestamps where seen on a connection. Only if that was the case we let the time-wait socket expire after the RTO, otherwise normal TCP_TIMEWAIT_LEN will be used. But we don't verify this on incoming SYN packets. If a connection teardown was less than TCP_PAWS_MSL seconds in the past we cannot guarantee to not accept data packets from an old connection if no timestamps are present. We should drop this SYN packet. This patch closes this loophole. Please note, this patch does not make tcp_tw_recycle in any way more usable but only adds another safety check: Sporadic drops of SYN packets because of reordering in the network or in the socket backlog queues can happen. Users behing NAT trying to connect to a tcp_tw_recycle enabled server can get caught in blackholes and their connection requests may regullary get dropped because hosts behind an address translator don't have synchronized tcp timestamp clocks. tcp_tw_recycle cannot work if peers don't have tcp timestamps enabled. In general, use of tcp_tw_recycle is disadvised. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neal Cardwell 提交于
Make sure we use the correct address-family-specific function for handling MTU reductions from within tcp_release_cb(). Previously AF_INET6 sockets were incorrectly always using the IPv6 code path when sometimes they were handling IPv4 traffic and thus had an IPv4 dst. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Diagnosed-by: NWillem de Bruijn <willemb@google.com> Fixes: 563d34d0 ("tcp: dont drop MTU reduction indications") Reviewed-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shmulik Ladkani 提交于
As of 4fddbf5d ("sit: strictly restrict incoming traffic to tunnel link device"), when looking up a tunnel, tunnel's underlying interface (t->parms.link) is verified to match incoming traffic's ingress device. However the comparison was incorrectly based on skb->dev->iflink. Instead, dev->ifindex should be used, which correctly represents the interface from which the IP stack hands the ipip6 packets. This allows setting up sit tunnels bound to vlan interfaces (otherwise incoming ipip6 traffic on the vlan interface was dropped due to ipip6_tunnel_lookup match failure). Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andrey Vagin 提交于
We don't know right timestamp for repaired skb-s. Wrong RTT estimations isn't good, because some congestion modules heavily depends on it. This patch adds the TCPCB_REPAIRED flag, which is included in TCPCB_RETRANS. Thanks to Eric for the advice how to fix this issue. This patch fixes the warning: [ 879.562947] WARNING: CPU: 0 PID: 2825 at net/ipv4/tcp_input.c:3078 tcp_ack+0x11f5/0x1380() [ 879.567253] CPU: 0 PID: 2825 Comm: socket-tcpbuf-l Not tainted 3.16.0-next-20140811 #1 [ 879.567829] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 879.568177] 0000000000000000 00000000c532680c ffff880039643d00 ffffffff817aa2d2 [ 879.568776] 0000000000000000 ffff880039643d38 ffffffff8109afbd ffff880039d6ba80 [ 879.569386] ffff88003a449800 000000002983d6bd 0000000000000000 000000002983d6bc [ 879.569982] Call Trace: [ 879.570264] [<ffffffff817aa2d2>] dump_stack+0x4d/0x66 [ 879.570599] [<ffffffff8109afbd>] warn_slowpath_common+0x7d/0xa0 [ 879.570935] [<ffffffff8109b0ea>] warn_slowpath_null+0x1a/0x20 [ 879.571292] [<ffffffff816d0a05>] tcp_ack+0x11f5/0x1380 [ 879.571614] [<ffffffff816d10bd>] tcp_rcv_established+0x1ed/0x710 [ 879.571958] [<ffffffff816dc9da>] tcp_v4_do_rcv+0x10a/0x370 [ 879.572315] [<ffffffff81657459>] release_sock+0x89/0x1d0 [ 879.572642] [<ffffffff816c81a0>] do_tcp_setsockopt.isra.36+0x120/0x860 [ 879.573000] [<ffffffff8110a52e>] ? rcu_read_lock_held+0x6e/0x80 [ 879.573352] [<ffffffff816c8912>] tcp_setsockopt+0x32/0x40 [ 879.573678] [<ffffffff81654ac4>] sock_common_setsockopt+0x14/0x20 [ 879.574031] [<ffffffff816537b0>] SyS_setsockopt+0x80/0xf0 [ 879.574393] [<ffffffff817b40a9>] system_call_fastpath+0x16/0x1b [ 879.574730] ---[ end trace a17cbc38eb8c5c00 ]--- v2: moving setting of skb->when for repaired skb-s in tcp_write_xmit, where it's set for other skb-s. Fixes: 431a9124 ("tcp: timestamp SYN+DATA messages") Fixes: 740b0f18 ("tcp: switch rtt estimations to usec resolution") Cc: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NAndrey Vagin <avagin@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 8月, 2014 6 次提交
-
-
由 Willem de Bruijn 提交于
Bytestream timestamps are correlated with a single byte in the skbuff, recorded in skb_shinfo(skb)->tskey. When fragmenting skbuffs, ensure that the tskey is set for the fragment in which the tskey falls (seqno <= tskey < end_seqno). The original implementation did not address fragmentation in tcp_fragment or tso_fragment. Add code to inspect the sequence numbers and move both tskey and the relevant tx_flags if necessary. Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
ACK timestamps are generated in tcp_clean_rtx_queue. The TSO datapath can break out early, causing the timestamp code to be skipped. Move the code up before the break. Reported-by: NDavid S. Miller <davem@davemloft.net> Also fix a boundary condition: tp->snd_una is the next unacknowledged byte and between tests inclusive (a <= b <= c), so generate a an ACK timestamp if (prior_snd_una <= tskey <= tp->snd_una - 1). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Maks Naumov 提交于
Signed-off-by: NMaks Naumov <maksqwe1@ukr.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
b67bfe0d (hlist: drop the node parameter from iterators) dropped the node parameter from iterators which lec_tbl_walk() was using to iterate the list. Signed-off-by: NChas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
One should not call blocking primitives inside a wait loop, since both require task_struct::state to sleep, so the inner will destroy the outer state. sigd_enq() will possibly sleep for alloc_skb(). Move sigd_enq() before prepare_to_wait() to avoid sleeping while waiting interruptibly. You do not actually need to call sigd_enq() after the initial prepare_to_wait() because we test the termination condition before calling schedule(). Based on suggestions from Peter Zijlstra. Signed-off-by: NChas Williams <chas@cmf.n4rl.navy.mil> Acked-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Jaeger 提交于
ovs_vport_alloc() bails out without freeing the memory 'vport' points to. Picked up by Coverity - CID 1230503. Fixes: 5cd667b0 ("openvswitch: Allow each vport to have an array of 'port_id's.") Signed-off-by: NChristoph Jaeger <cj@linux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 8月, 2014 1 次提交
-
-
由 Vlad Yasevich 提交于
Currently the functionality to untag traffic on input resides as part of the vlan module and is build only when VLAN support is enabled in the kernel. When VLAN is disabled, the function vlan_untag() turns into a stub and doesn't really untag the packets. This seems to create an interesting interaction between VMs supporting checksum offloading and some network drivers. There are some drivers that do not allow the user to change tx-vlan-offload feature of the driver. These drivers also seem to assume that any VLAN-tagged traffic they transmit will have the vlan information in the vlan_tci and not in the vlan header already in the skb. When transmitting skbs that already have tagged data with partial checksum set, the checksum doesn't appear to be updated correctly by the card thus resulting in a failure to establish TCP connections. The following is a packet trace taken on the receiver where a sender is a VM with a VLAN configued. The host VM is running on doest not have VLAN support and the outging interface on the host is tg3: 10:12:43.503055 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q (0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27243, offset 0, flags [DF], proto TCP (6), length 60) 10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect -> 0x48d9), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val 4294837885 ecr 0,nop,wscale 7], length 0 10:12:44.505556 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q (0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27244, offset 0, flags [DF], proto TCP (6), length 60) 10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect -> 0x44ee), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val 4294838888 ecr 0,nop,wscale 7], length 0 This connection finally times out. I've only access to the TG3 hardware in this configuration thus have only tested this with TG3 driver. There are a lot of other drivers that do not permit user changes to vlan acceleration features, and I don't know if they all suffere from a similar issue. The patch attempt to fix this another way. It moves the vlan header stipping code out of the vlan module and always builds it into the kernel network core. This way, even if vlan is not supported on a virtualizatoin host, the virtual machines running on top of such host will still work with VLANs enabled. CC: Patrick McHardy <kaber@trash.net> CC: Nithin Nayak Sujir <nsujir@broadcom.com> CC: Michael Chan <mchan@broadcom.com> CC: Jiri Pirko <jiri@resnulli.us> Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com> Acked-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 8月, 2014 3 次提交
-
-
由 Ilya Dryomov 提交于
Determining ->last_piece based on the value of ->page_offset + length is incorrect because length here is the length of the entire message. ->last_piece set to false even if page array data item length is <= PAGE_SIZE, which results in invalid length passed to ceph_tcp_{send,recv}page() and causes various asserts to fire. # cat pages-cursor-init.sh #!/bin/bash rbd create --size 10 --image-format 2 foo FOO_DEV=$(rbd map foo) dd if=/dev/urandom of=$FOO_DEV bs=1M &>/dev/null rbd snap create foo@snap rbd snap protect foo@snap rbd clone foo@snap bar # rbd_resize calls librbd rbd_resize(), size is in bytes ./rbd_resize bar $(((4 << 20) + 512)) rbd resize --size 10 bar BAR_DEV=$(rbd map bar) # trigger a 512-byte copyup -- 512-byte page array data item dd if=/dev/urandom of=$BAR_DEV bs=1M count=1 seek=5 The problem exists only in ceph_msg_data_pages_cursor_init(), ceph_msg_data_pages_advance() does the right thing. The size_t cast is unnecessary. Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: NIlya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: NSage Weil <sage@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Jiri Benc 提交于
Commit 1d8faf48 ("net/core: Add VF link state control") added new attribute to IFLA_VF_INFO group in rtnl_fill_ifinfo but did not adjust size of the allocated memory in if_nlmsg_size/rtnl_vfinfo_size. As the result, we may trigger warnings in rtnl_getlink and similar functions when many VF links are enabled, as the information does not fit into the allocated skb. Fixes: 1d8faf48 ("net/core: Add VF link state control") Reported-by: NYulong Pei <ypei@redhat.com> Signed-off-by: NJiri Benc <jbenc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Niv Yehezkel 提交于
Since fib_lookup cannot return ESRCH no longer, checking for this error code is no longer neccesary. Signed-off-by: NNiv Yehezkel <executerx@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 8月, 2014 1 次提交
-
-
由 Julia Lawall 提交于
Convert a zero return value on error to a negative one, as returned elsewhere in the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier ret; expression e1,e2; @@ ( if (\(ret < 0\|ret != 0\)) { ... return ret; } | ret = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-