- 14 9月, 2014 2 次提交
-
-
由 John Fastabend 提交于
rcu'ify tcf_proto this allows calling tc_classify() without holding any locks. Updaters are protected by RTNL. This patch prepares the core net_sched infrastracture for running the classifier/action chains without holding the qdisc lock however it does nothing to ensure cls_xxx and act_xxx types also work without locking. Additional patches are required to address the fall out. Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Fastabend 提交于
Add __rcu notation to qdisc handling by doing this we can make smatch output more legible. And anyways some of the cases should be using rcu_dereference() see qdisc_all_tx_empty(), qdisc_tx_chainging(), and so on. Also *wake_queue() API is commonly called from driver timer routines without rcu lock or rtnl lock. So I added rcu_read_lock() blocks around netif_wake_subqueue and netif_tx_wake_queue. Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 9月, 2014 3 次提交
-
-
由 Scott Wood 提交于
Commit 2abb7cdc ("udp: Add support for doing checksum unnecessary conversion") caused napi_gro_cb structs with the "flush" field zero to take the "udp_gro_receive" path rather than the "set flush to 1" path that they would previously take. As a result I saw booting from an NFS root hang shortly after starting userspace, with "server not responding" messages. This change to the handling of "flush == 0" packets appears to be incidental to the goal of adding new code in the case where skb_gro_checksum_validate_zero_check() returns zero. Based on that and the fact that it breaks things, I'm assuming that it is unintentional. Fixes: 2abb7cdc ("udp: Add support for doing checksum unnecessary conversion") Cc: Tom Herbert <therbert@google.com> Signed-off-by: NScott Wood <scottwood@freescale.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
There is a possible issue with the use, or lack thereof of sk_refcnt and sk_wmem_alloc in the wifi ack status functionality. Specifically if a socket were to request acknowledgements, and the socket were to have sk_refcnt drop to 0 resulting in it waiting on sk_wmem_alloc to reach 0 it would be possible to have sock_queue_err_skb orphan the last buffer, resulting in __sk_free being called on the socket. After this the buffer is enqueued on sk_error_queue, however the queue has already been flushed resulting in at least a memory leak, if not a data corruption. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Acked-by: NJohannes Berg <johannes@sipsolutions.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
This change adds some documentation to the call skb_clone_sk. This is meant to help clarify the purpose of the function for other developers. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 9月, 2014 5 次提交
-
-
由 Erik Hugne 提交于
This fixes the following sparse warnings: sparse: symbol 'tipc_update_nametbl' was not declared. Should it be static? Also, the function is changed to return bool upon success, rather than a potentially freed pointer. Signed-off-by: NErik Hugne <erik.hugne@ericsson.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
Use the more common pr_warn. Other miscellanea: o Coalesce formats o Realign arguments Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
Use the more common pr_warn. Coalesce formats. Realign arguments. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
Use the more common pr_warn. Realign arguments. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
Use the more common pr_warn. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 9月, 2014 21 次提交
-
-
由 Tom Herbert 提交于
Add ipv6_gro_receive and ipv6_gro_complete to sit_offload to support GRO. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Add inet_gro_receive and inet_gro_complete to ipip_offload to support GRO. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
In TCP gro we check flush_id which is derived from the IP identifier. In IPv4 gro path the flush_id is set with the expectation that every matched packet increments IP identifier. In IPv6, the flush_id is never set and thus is uinitialized. What's worse is that in IPv6 over IPv4 encapsulation, the IP identifier is taken from the outer header which is currently not incremented on every packet for Linux stack, so GRO in this case never matches packets (identifier is not increasing). This patch clears flush_id for every time for a matched packet in IPv6 gro_receive. We need to do this each time to overwrite the setting that would be done in IPv4 gro_receive per the outer header in IPv6 over Ipv4 encapsulation. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
net/ipv4/udp_offload.c:339:5: warning: symbol 'udp4_gro_complete' was not declared. Should it be static? Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Fixes: 57c67ff4 ("udp: additional GRO support") Acked-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
net/core/net_namespace.c:227:18: warning: incorrect type in argument 1 (different address spaces) net/core/net_namespace.c:227:18: expected void const *<noident> net/core/net_namespace.c:227:18: got struct net_generic [noderef] <asn:4>*gen We can use rcu_access_pointer() here as read-side access to the pointer was removed at least one grace period ago. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
net/ipv6/udp_offload.c:159:5: warning: symbol 'udp6_gro_complete' was not declared. Should it be static? Signed-off-by: NEric Dumazet <edumazet@google.com> Fixes: 57c67ff4 ("udp: additional GRO support") Cc: Tom Herbert <therbert@google.com> Acked-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Remove one sparse warning : net/ipv4/ip_sockglue.c:328:22: warning: incorrect type in assignment (different address spaces) net/ipv4/ip_sockglue.c:328:22: expected struct ip_ra_chain [noderef] <asn:4>*next net/ipv4/ip_sockglue.c:328:22: got struct ip_ra_chain *[assigned] ra And replace one rcu_assign_ptr() by RCU_INIT_POINTER() where applicable. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
It's not used anywhere, so just remove these. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Few packets have timestamping enabled. Exit sock_tx_timestamp quickly in this common case. Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
not used anymore since ddecf0f4 (net_sched: sfq: add optional RED on top of SFQ). Signed-off-by: NFlorian Westphal <fw@strlen.de> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Reported by Mikulas Patocka, kmemcheck currently barks out a false positive since we don't have special kmemcheck annotation for bitfields used in bpf_prog structure. We currently have jited:1, len:31 and thus when accessing len while CONFIG_KMEMCHECK enabled, kmemcheck throws a warning that we're reading uninitialized memory. As we don't need the whole bit universe for pages member, we can just split it to u16 and use a bool flag for jited instead of a bitfield. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Alexander Duyck reported high false sharing on dst refcount in tcp stack when prequeue is used. prequeue is the mechanism used when a thread is blocked in recvmsg()/read() on a TCP socket, using a blocking model rather than select()/poll()/epoll() non blocking one. We already try to use RCU in input path as much as possible, but we were forced to take a refcount on the dst when skb escaped RCU protected region. When/if the user thread runs on different cpu, dst_release() will then touch dst refcount again. Commit 09316255 (tcp: force a dst refcount when prequeue packet) was an example of a race fix. It turns out the only remaining usage of skb->dst for a packet stored in a TCP socket prequeue is IP early demux. We can add a logic to detect when IP early demux is probably going to use skb->dst. Because we do an optimistic check rather than duplicate existing logic, we need to guard inet_sk_rx_dst_set() and inet6_sk_rx_dst_set() from using a NULL dst. Many thanks to Alexander for providing a nice bug report, git bisection, and reproducer. Tested using Alexander script on a 40Gb NIC, 8 RX queues. Hosts have 24 cores, 48 hyper threads. echo 0 >/proc/sys/net/ipv4/tcp_autocorking for i in `seq 0 47` do for j in `seq 0 2` do netperf -H $DEST -t TCP_STREAM -l 1000 \ -c -C -T $i,$i -P 0 -- \ -m 64 -s 64K -D & done done Before patch : ~6Mpps and ~95% cpu usage on receiver After patch : ~9Mpps and ~35% cpu usage on receiver. Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Li RongQing 提交于
Change the date type of error status from u64 to atomic_long_t, and use atomic operation, then remove the lock which is used to protect the error status. The operation of atomic maybe faster than spin lock. Cc: Pravin Shelar <pshelar@nicira.com> Signed-off-by: NLi RongQing <roy.qing.li@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rami Rosen 提交于
This patch removes an unncessary check in the br_afspec() method of br_netlink.c. Signed-off-by: NRami Rosen <ramirose@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Allow rtnetlink users to set bridge master info via IFLA_INFO_DATA attr This initial part implements forward_delay, hello_time, max_age options. Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Allow rtnetlink users to get bridge master info in IFLA_INFO_DATA attr This initial part implements forward_delay, hello_time, max_age options. Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Allow rtnetlink users to set port info via IFLA_INFO_SLAVE_DATA attr Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Allow rtnetlink users to get port info in IFLA_INFO_SLAVE_DATA attr Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The thing is that netdev_master_upper_dev_link calls call_netdevice_notifiers(NETDEV_CHANGEUPPER, dev). That generates rtnl link message and during that, rtnl_link_ops->fill_slave_info is called. But with current ordering, rx_handler and IFF_BRIDGE_PORT are not set yet so there would have to be check for that in fill_slave_info callback. Resolve this by reordering to similar what bonding and team does to avoid the check. Also add removal of IFF_BRIDGE_PORT flag into error path. Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vincent Bernat 提交于
net.ipv4.ip_nonlocal_bind sysctl was global to all network namespaces. This patch allows to set a different value for each network namespace. Signed-off-by: NVincent Bernat <vincent@bernat.im> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 9月, 2014 9 次提交
-
-
由 Arturo Borrero 提交于
The nft_masq expression is intended to perform NAT in the masquerade flavour. We decided to have the masquerade functionality in a separated expression other than nft_nat. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv6; a similar patch exists to handle IPv4. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv4; a similar patch follows to handle IPv6. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Nicolas Dichtel 提交于
This is already done for x_tables (family AF_INET and AF_INET6), let's do it for AF_BRIDGE also. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
Both SNAT and DNAT (and the upcoming masquerade) can have additional configuration parameters, such as port randomization and NAT addressing persistence. We can cover these scenarios by simply adding a flag attribute for userspace to fill when needed. The flags to use are defined in include/uapi/linux/netfilter/nf_nat.h: NF_NAT_RANGE_MAP_IPS NF_NAT_RANGE_PROTO_SPECIFIED NF_NAT_RANGE_PROTO_RANDOM NF_NAT_RANGE_PERSISTENT NF_NAT_RANGE_PROTO_RANDOM_FULLY NF_NAT_RANGE_PROTO_RANDOM_ALL The caller must take care of not messing up with the flags, as they are added unconditionally to the final resulting nf_nat_range. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
This patch extend the NFT_MSG_DELTABLE call to support flushing the entire ruleset. The options now are: * No family speficied, no table specified: flush all the ruleset. * Family specified, no table specified: flush all tables in the AF. * Family specified, table specified: flush the given table. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arturo Borrero 提交于
This patch refactor the code to schedule objects deletion. They are useful in follow-up patches. In order to be able to use these new helper functions in all the code, they are placed in the top of the file, with all the dependant functions and symbols. nft_rule_disactivate_next has been renamed to nft_rule_deactivate. Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Bojan Prtvar 提交于
The skb_find_text() accepts uninitialized textsearch state variable. Signed-off-by: NBojan Prtvar <prtvar.b@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Julian Anastasov 提交于
Use union to reserve the required stack space for sockopt data which is less than the currently hardcoded value of 128. Now the tables for commands should be more readable. The checks added for readability are optimized by compiler, others warn at compile time if command uses too much stack or exceeds the storage of set_arglen and get_arglen. As Dan Carpenter points out, we can run for unprivileged user, so we can silent some error messages. Signed-off-by: NJulian Anastasov <ja@ssi.bg> CC: Dan Carpenter <dan.carpenter@oracle.com> CC: Andrey Utkin <andrey.krieger.utkin@gmail.com> CC: David Binderman <dcb314@hotmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-