- 05 11月, 2014 2 次提交
-
-
由 Fabian Frederick 提交于
Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Fabian Frederick 提交于
Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2014 2 次提交
-
-
由 Eric Dumazet 提交于
Yaogong replaces TCP out of order receive queue by an RB tree. As netem already does a private skb->{next/prev/tstamp} union with a 'struct rb_node', lets do this in a cleaner way. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Yaogong Wang <wygivan@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
net_rx_action() can mask irqs a single time to transfert sd->poll_list into a private list, for a very short duration. Then, napi_complete() can avoid masking irqs again, and net_rx_action() only needs to mask irq again in slow path. This patch removes 2 couples of irq mask/unmask per typical NAPI run, more if multiple napi were triggered. Note this also allows to give control back to caller (do_softirq()) more often, so that other softirq handlers can be called a bit earlier, or ksoftirqd can be wakeup earlier under pressure. This was developed while testing an alternative to RX interrupt mitigation to reduce latencies while keeping or improving GRO aggregation on fast NIC. Idea is to test napi->gro_list at the end of a napi->poll() and reschedule one NAPI poll, but after servicing a full round of softirqs (timers, TX, rcu, ...). This will be allowed only if softirq is currently serviced by idle task or ksoftirqd, and resched not needed. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 11月, 2014 3 次提交
-
-
由 Guenter Roeck 提交于
If a driver supports reading EEPROM but no EEPROM is installed in the system, the driver's get_eeprom_len function returns 0. ethtool will subsequently try to read that zero-length EEPROM anyway. If the driver does not support EEPROM access at all, this operation will return -EOPNOTSUPP. If the driver does support EEPROM access but no EEPROM is installed, the operation will return -EINVAL. Return -EOPNOTSUPP in both cases for consistency. Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Tested-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pravin B Shelar 提交于
Kconfig already allows mpls to be built as module. Following patch fixes Makefile to do same. CC: Simon Horman <simon.horman@netronome.com> Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Acked-by: NSimon Horman <simon.horman@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pravin B Shelar 提交于
mpls gso handler needs to pull skb after segmenting skb. CC: Simon Horman <simon.horman@netronome.com> Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Acked-by: NSimon Horman <simon.horman@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 10月, 2014 24 次提交
-
-
由 Pablo Neira Ayuso 提交于
Restrict the reject expression to the prerouting and input bridge hooks. If we allow this to be used from forward or any other later bridge hook, if the frame is flooded to several ports, we'll end up sending several reject packets, one per cloned packet. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
If the packet is received via the bridge stack, this cannot reject packets from the IP stack. This adds functions to build the reject packet and send it from the bridge stack. Comments and assumptions on this patch: 1) Validate the IPv4 and IPv6 headers before further processing, given that the packet comes from the bridge stack, we cannot assume they are clean. Truncated packets are dropped, we follow similar approach in the existing iptables match/target extensions that need to inspect layer 4 headers that is not available. This also includes packets that are directed to multicast and broadcast ethernet addresses. 2) br_deliver() is exported to inject the reject packet via bridge localout -> postrouting. So the approach is similar to what we already do in the iptables reject target. The reject packet is sent to the bridge port from which we have received the original packet. 3) The reject packet is forged based on the original packet. The TTL is set based on sysctl_ip_default_ttl for IPv4 and per-net ipv6.devconf_all hoplimit for IPv6. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
That can be reused by the reject bridge expression to build the reject packet. The new functions are: * nf_reject_ip6_tcphdr_get(): to sanitize and to obtain the TCP header. * nf_reject_ip6hdr_put(): to build the IPv6 header. * nf_reject_ip6_tcphdr_put(): to build the TCP header. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
That can be reused by the reject bridge expression to build the reject packet. The new functions are: * nf_reject_ip_tcphdr_get(): to sanitize and to obtain the TCP header. * nf_reject_iphdr_put(): to build the IPv4 header. * nf_reject_ip_tcphdr_put(): to build the TCP header. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Fixes: 36d2af59 ("netfilter: nf_tables: allow to filter from prerouting and postrouting") Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Ben Hutchings 提交于
UFO is now disabled on all drivers that work with virtio net headers, but userland may try to send UFO/IPv6 packets anyway. Instead of sending with ID=0, we should select identifiers on their behalf (as we used to). Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Fixes: 916e4cf4 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data") Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Some drivers are unable to perform TX completions in a bound time. They instead call skb_orphan() Problem is skb_fclone_busy() has to detect this case, otherwise we block TCP retransmits and can freeze unlucky tcp sessions on mostly idle hosts. Signed-off-by: NEric Dumazet <edumazet@google.com> Fixes: 1f3279ae ("tcp: avoid retransmits of TCP packets hanging in host queues") Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
Challenge ACK is described in RFC 5961, fix typo. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Currently, skb_inner_network_header is used but this does not account for Ethernet header for ETH_P_TEB. Use skb_inner_mac_header which handles TEB and also should work with IP encapsulation in which case inner mac and inner network headers are the same. Tested: Ran TCP_STREAM over GRE, worked as expected. Signed-off-by: NTom Herbert <therbert@google.com> Acked-by: NAlexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michele Baldessari 提交于
Fixes checkpatch warning: "WARNING: Prefer seq_puts to seq_printf" Signed-off-by: NMichele Baldessari <michele@acksyn.org> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michele Baldessari 提交于
It is often quite helpful to be able to know the state of a transport outside of the application itself (for troubleshooting purposes or for monitoring purposes). Add it under /proc/net/sctp/remaddr. Signed-off-by: NMichele Baldessari <michele@acksyn.org> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Cavallari 提交于
If we cache them, the kernel will reuse them, independently of whether forwarding is enabled or not. Which means that if forwarding is disabled on the input interface where the first routing request comes from, then that unreachable result will be cached and reused for other interfaces, even if forwarding is enabled on them. The opposite is also true. This can be verified with two interfaces A and B and an output interface C, where B has forwarding enabled, but not A and trying ip route get $dst iif A from $src && ip route get $dst iif B from $src Signed-off-by: NNicolas Cavallari <nicolas.cavallari@green-communications.fr> Reviewed-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 stephen hemminger 提交于
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
Only count packets that failed cookie-authentication. We can get SYNCOOKIESFAILED > 0 while we never even sent a single cookie. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 stephen hemminger 提交于
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexey Andriyanov 提交于
The fallback device is in ipv6 mode by default. The mode can not be changed in runtime, so there is no way to decapsulate ip4in6 packets coming from various sources without creating the specific tunnel ifaces for each peer. This allows to update the fallback tunnel device, but only the mode could be changed. Usual command should work for the fallback device: `ip -6 tun change ip6tnl0 mode any` The fallback device can not be hidden from the packet receiver as a regular tunnel, but there is no need for synchronization as long as we do single assignment. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NAlexey Andriyanov <alan@al-an.info> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Fabian Frederick 提交于
Do assignment before if condition and test !skb like in rawv6_recvmsg() Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Fabian Frederick 提交于
remove __inline__ / inline and let compiler decide what to do with static functions Inspired-by: N"David S. Miller" <davem@davemloft.net> Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Fabian Frederick 提交于
Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Fabian Frederick 提交于
Apply commit e0f36310 ("ipx: remove unnecessary casting on ntohl") to all seq_printf/08lX Inspired-by: N"David S. Miller" <davem@davemloft.net> Inspired-by: NJoe Perches <joe@perches.com> Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guenter Roeck 提交于
Add support for reading switch registers with 'ethtool -d'. Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guenter Roeck 提交于
On some chips it is possible to access the switch eeprom. Add infrastructure support for it. Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guenter Roeck 提交于
Some switches provide chip temperature data. Add support for reporting it through the hwmon subsystem. Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guenter Roeck 提交于
Setting skb->protocol to a private protocol type may result in warning messages such as e1000e 0000:00:19.0 em1: checksum_partial proto=dada! This happens if the L3 protocol is IP or IPv6 and skb->ip_summed is set to CHECKSUM_PARTIAL. Looking through the code, it appears that changing skb->protocol for transmitted packets is not necessary and may actually be harmful. For example, it prevents purposely unmodified (from a DSA perspective) network drivers from properly setting up their transmit checksum offload pointers since they inspect skb->protocol to set up the IPv4 header or IPv6 header pointers. So don't unnecessarily change the protocol field. Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 10月, 2014 9 次提交
-
-
由 Nicolas Dichtel 提交于
In neigh_parms_release() we loop over all entries to find the entry given in argument and being able to remove it from the list. By using a double linked list, we can avoid this loop. Here are some numbers with 30 000 dummy interfaces configured: Before the patch: $ time rmmod dummy real 2m0.118s user 0m0.000s sys 1m50.048s After the patch: $ time rmmod dummy real 1m9.970s user 0m0.000s sys 0m47.976s Suggested-by: NThierry Herbelot <thierry.herbelot@6wind.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
napi_schedule() can be called from any context and has to mask hard irqs. Add a variant that can only be called from hard interrupts handlers or when irqs are already masked. Many NIC drivers can use it from their hard IRQ handler instead of generic variant. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nikolay Aleksandrov 提交于
The WARN_ON in inet_evict_bucket can be triggered by a valid case: inet_frag_kill and inet_evict_bucket can be running in parallel on the same queue which means that there has been at least one more ref added by a previous inet_frag_find call, but inet_frag_kill can delete the timer before inet_evict_bucket which will cause the WARN_ON() there to trigger since we'll have refcnt!=1. Now, this case is valid because the queue is being "killed" for some reason (removed from the chain list and its timer deleted) so it will get destroyed in the end by one of the inet_frag_put() calls which reaches 0 i.e. refcnt is still valid. CC: Florian Westphal <fw@strlen.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McLean <chutzpah@gentoo.org> Fixes: b13d3cbf ("inet: frag: move eviction of queues to work queue") Reported-by: NPatrick McLean <chutzpah@gentoo.org> Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nikolay Aleksandrov 提交于
When the evictor is running it adds some chosen frags to a local list to be evicted once the chain lock has been released but at the same time the *frag_queue can be running for some of the same queues and it may call inet_frag_kill which will wait on the chain lock and will then delete the queue from the wrong list since it was added in the eviction one. The fix is simple - check if the queue has the evict flag set under the chain lock before deleting it, this is safe because the evict flag is set only under that lock and having the flag set also means that the queue has been detached from the chain list, so no need to delete it again. An important note to make is that we're safe w.r.t refcnt because inet_frag_kill and inet_evict_bucket will sync on the del_timer operation where only one of the two can succeed (or if the timer is executing - none of them), the cases are: 1. inet_frag_kill succeeds in del_timer - then the timer ref is removed, but inet_evict_bucket will not add this queue to its expire list but will restart eviction in that chain 2. inet_evict_bucket succeeds in del_timer - then the timer ref is kept until the evictor "expires" the queue, but inet_frag_kill will remove the initial ref and will set INET_FRAG_COMPLETE which will make the frag_expire fn just to remove its ref. In the end all of the queue users will do an inet_frag_put and the one that reaches 0 will free it. The refcount balance should be okay. CC: Florian Westphal <fw@strlen.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McLean <chutzpah@gentoo.org> Fixes: b13d3cbf ("inet: frag: move eviction of queues to work queue") Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Reported-by: NPatrick McLean <chutzpah@gentoo.org> Tested-by: NPatrick McLean <chutzpah@gentoo.org> Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com> Reviewed-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Erik Kline 提交于
Add a sysctl that causes an interface's optimistic addresses to be considered equivalent to other non-deprecated addresses for source address selection purposes. Preferred addresses will still take precedence over optimistic addresses, subject to other ranking in the source address selection algorithm. This is useful where different interfaces are connected to different networks from different ISPs (e.g., a cell network and a home wifi network). The current behaviour complies with RFC 3484/6724, and it makes sense if the host has only one interface, or has multiple interfaces on the same network (same or cooperating administrative domain(s), but not in the multiple distinct networks case. For example, if a mobile device has an IPv6 address on an LTE network and then connects to IPv6-enabled wifi, while the wifi IPv6 address is undergoing DAD, IPv6 connections will try use the wifi default route with the LTE IPv6 address, and will get stuck until they time out. Also, because optimistic nodes can receive frames, issue an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMSTIC flag appropriately set). A second RTM_NEWADDR is sent if DAD completes (the address flags have changed), otherwise an RTM_DELADDR is sent. Also: add an entry in ip-sysctl.txt for optimistic_dad. Signed-off-by: NErik Kline <ek@google.com> Acked-by: NLorenzo Colitti <lorenzo@google.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
While testing upcoming Yaogong patch (converting out of order queue into an RB tree), I hit the max reordering level of linux TCP stack. Reordering level was limited to 127 for no good reason, and some network setups [1] can easily reach this limit and get limited throughput. Allow a new max limit of 300, and add a sysctl to allow admins to even allow bigger (or lower) values if needed. [1] Aggregation of links, per packet load balancing, fabrics not doing deep packet inspections, alternative TCP congestion modules... Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Yaogong Wang <wygivan@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Toshiaki Makita 提交于
This patch generalizes commit d6a4a104 ("tcp: GSO should be TSQ friendly") to protocols using skb_set_owner_w() TCP uses its own destructor (tcp_wfree) and needs a more complex scheme as explained in commit 6ff50cd5 ("tcp: gso: do not generate out of order packets") This allows UDP sockets using UFO to get proper backpressure, thus avoiding qdisc drops and excessive cpu usage. Here are performance test results (macvlan on vlan): - Before # netperf -t UDP_STREAM ... Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 212992 65507 60.00 144096 1224195 1258.56 212992 60.00 51 0.45 Average: CPU %user %nice %system %iowait %steal %idle Average: all 0.23 0.00 25.26 0.08 0.00 74.43 - After # netperf -t UDP_STREAM ... Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 212992 65507 60.00 109593 0 957.20 212992 60.00 109593 957.20 Average: CPU %user %nice %system %iowait %steal %idle Average: all 0.18 0.00 8.38 0.02 0.00 91.43 [edumazet] Rewrote patch and changelog. Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lubomir Rintel 提交于
NetworkManager might want to know that it changed when the router advertisement arrives. Signed-off-by: NLubomir Rintel <lkundrak@v3.sk> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
Cc: Vijay Subramanian <vijaynsu@cisco.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Acked-by: NEric Dumazet <edumazet@google.com>
-