- 20 4月, 2009 1 次提交
-
-
由 Florian Westphal 提交于
last_synq_overflow eats 4 or 8 bytes in struct tcp_sock, even though it is only used when a listening sockets syn queue is full. We can (ab)use rx_opt.ts_recent_stamp to store the same information; it is not used otherwise as long as a socket is in listen state. Move linger2 around to avoid splitting struct mtu_probe across cacheline boundary on 32 bit arches. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 4月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
inet_register_protosw() function is responsible for adding a new inet protocol into a global table (inetsw[]) that is used with RCU rules. As soon as the store of the pointer is done, other cpus might see this new protocol in inetsw[], so we have to make sure new protocol is ready for use. All pending memory updates should thus be committed to memory before setting the pointer. This is correctly done using rcu_assign_pointer() synchronize_net() is typically used at unregister time, after unsetting the pointer, to make sure no other cpu is still using the object we want to dismantle. Using it at register time is only adding an artificial delay that could hide a real bug, and this bug could popup if/when synchronize_rcu() can proceed faster than now. This saves about 13 ms on boot time on a HZ=1000 8 cpus machine ;) (4 calls to inet_register_protosw(), and about 3200 us per call) Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
The removal of the SAME target accidentally removed one feature that is not available from the normal NAT targets so far, having multi-range mappings that use the same mapping for each connection from a single client. The current behaviour is to choose the address from the range based on source and destination IP, which breaks when communicating with sites having multiple addresses that require all connections to originate from the same IP address. Introduce a IP_NAT_RANGE_PERSISTENT option that controls whether the destination address is taken into account for selecting addresses. http://bugzilla.kernel.org/show_bug.cgi?id=12954Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 16 4月, 2009 1 次提交
-
-
由 Herbert Xu 提交于
It turns out that copying a 16-byte area at ~800k times a second can be really expensive :) This patch redesigns the frags GRO interface to avoid copying that area twice. The two disciples of the frags interface have been converted. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 4月, 2009 5 次提交
-
-
由 Patrick McHardy 提交于
Commit ea781f19 (netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and) get rid of call_rcu() was missing one conversion to the hlist_nulls functions, causing a crash when unloading conntrack helper modules. Reported-and-tested-by: NMariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Eric Dumazet 提交于
Latest tcpdump/libpcap triggers annoying messages because of high order page allocation failures (when lowmem exhausted or fragmented) These allocation errors are correctly handled so could be silent. [22660.208901] tcpdump: page allocation failure. order:5, mode:0xc0d0 [22660.208921] Pid: 13866, comm: tcpdump Not tainted 2.6.30-rc2 #170 [22660.208936] Call Trace: [22660.208950] [<c04e2b46>] ? printk+0x18/0x1a [22660.208965] [<c02760f7>] __alloc_pages_internal+0x357/0x460 [22660.208980] [<c0276251>] __get_free_pages+0x21/0x40 [22660.208995] [<c04cc835>] packet_set_ring+0x105/0x3d0 [22660.209009] [<c04ccd1d>] packet_setsockopt+0x21d/0x4d0 [22660.209025] [<c0270400>] ? filemap_fault+0x0/0x450 [22660.209040] [<c0449e34>] sys_setsockopt+0x54/0xa0 [22660.209053] [<c044b97f>] sys_socketcall+0xef/0x270 [22660.209067] [<c0202e34>] sysenter_do_call+0x12/0x26 Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
commit ca735b3a 'netfilter: use a linked list of loggers' introduced an array of list_head in "struct nf_logger", but forgot to initialize it in nf_log_register(). This resulted in oops when calling nf_log_unregister() at module unload time. Reported-and-tested-by: NMariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Acked-by: NEric Leblond <eric@inl.fr> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 David S. Miller 提交于
This reverts commit 244f46ae. Alan Cox did the research, and just like the other radio protocols zero-length frames have meaning because at the top level ROSE is X.25 PLP. So this zero-length filtering is invalid. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Since everybody has been focusing on baremetal GRO performance no one noticed when I added a bug that zapped gso_size for all GRO packets. This only gets picked up when you forward the skb out of an interface. Thanks to Mark Wagner for noticing this bug when testing kvm. Reported-by: NMark Wagner <mwagner@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 4月, 2009 4 次提交
-
-
由 Yang Hongyang 提交于
After switch (rthdr->type) {...},the check below is completely useless.Because: if the type is 2,then hdrlen must be 2 and segments_left must be 1,clearly the check is redundant;if the type is not 2,then goto sticky_done,the check is useless too. Signed-off-by: NYang Hongyang <yanghy@cn.fujitsu.com> Reviewed-by: NShan Wei <shanwei@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ilpo Järvinen 提交于
A long-standing feature in tcp_init_metrics() is such that any of its goto reset prevents call to tcp_init_cwnd(). Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
When vlan acceleration is used on receive, the vlan tag is maintained outside of the skb data. The existing vlan tag match only works on TX path because it uses vlan_get_tag which tests for VLAN_HW_TX_ACCEL. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Hi: gro: Normalise skb before bypassing GRO on netpoll VLAN path When we detect netpoll RX on the GRO VLAN path we bail out and call the normal VLAN receive handler. However, the packet needs to be normalised by calling eth_type_trans since that's what the normal path expects (normally the GRO path does the fixup). This patch adds the necessary call to vlan_gro_frags. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Thanks, Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 4月, 2009 3 次提交
-
-
由 Vlad Yasevich 提交于
Commit b2f5e7cd (ipv6: Fix conflict resolutions during ipv6 binding) introduced a regression where time-wait sockets were not treated correctly. This resulted in the following: BUG: unable to handle kernel NULL pointer dereference at 0000000000000062 IP: [<ffffffff805d7d61>] ipv4_rcv_saddr_equal+0x61/0x70 ... Call Trace: [<ffffffffa033847b>] ipv6_rcv_saddr_equal+0x1bb/0x250 [ipv6] [<ffffffffa03505a8>] inet6_csk_bind_conflict+0x88/0xd0 [ipv6] [<ffffffff805bb18e>] inet_csk_get_port+0x1ee/0x400 [<ffffffffa0319b7f>] inet6_bind+0x1cf/0x3a0 [ipv6] [<ffffffff8056d17c>] ? sockfd_lookup_light+0x3c/0xd0 [<ffffffff8056ed49>] sys_bind+0x89/0x100 [<ffffffff80613ea2>] ? trace_hardirqs_on_thunk+0x3a/0x3c [<ffffffff8020bf9b>] system_call_fastpath+0x16/0x1b Tested-by: NBrian Haley <brian.haley@hp.com> Tested-by: NEd Tomlinson <edt@aei.ca> Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Yongjun 提交于
Add dev_put() after dev_get_by_index() to avoid leakage of device. Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
Currently netif_device_attach/detach are only stopping one queue. They should be starting and stopping all the queues on a given device. Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 4月, 2009 11 次提交
-
-
由 Dan Carpenter 提交于
rdma_create_id() doesn't return NULL, only ERR_PTR(). Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested. regards, dan carpenter Signed-off-by: NDan Carpenter <error27@gmail.com> Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dan Carpenter 提交于
rdma_create_id() returns ERR_PTR() not null. Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested. regards, dan carpenter Signed-off-by: NDan Carpenter <error27@gmail.com> Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Yongjun 提交于
Use kmem_cache_zalloc instead of kmem_cache_alloc/memset. Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Huang Weiyi 提交于
Remove unused #include <version.h> in net/rds/af_rds.c. Signed-off-by: NHuang Weiyi <weiyi.huang@gmail.com> Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Grover 提交于
Use the new function that is simpler and faster. Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Grover 提交于
The first message to a remote node should prompt a new connection. Even an RDMA op via CMSG. Therefore move CMSG parsing to after connection establishment. Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Grover 提交于
Putting the constant first is a supposed "best practice" that actually makes the code harder to read. Thanks to Roland Dreier for finding a bug in this "simple, obviously correct" patch. Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steve Wise 提交于
Fix hack that restricts the credit advertisement to 127. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steve Wise 提交于
The RDS_LL_SEND_FULL bit should be set when we stop transmitted due to flow control. Otherwise the send worker will keep trying as opposed to sleeping until we unthrottle. Saves CPU. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Grover 提交于
Had some lingering instances of _iw_ variable names from when the listen code was centralized into rdma_transport.c Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steve Wise 提交于
Currently the recv ring low water mark is 1/4 the depth. Performance measurements show that this limits iWARP throughput by flow controlling the rds-stress senders. Setting it to 1/2 seems to max the T3 performance. I tried even higher levels but that didn't help and it started to increase the rds thread cpu utilization. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 4月, 2009 2 次提交
-
-
由 Steffen Klassert 提交于
If an ipv4 packet (not locally generated with IP_DF flag not set) bigger than mtu size is supposed to go via a xfrm ipv6 tunnel, the packetsize check in xfrm4_tunnel_check_size() is omited and ipv6 drops the packet without sending a notice to the original sender of the ipv4 packet. Another issue is that ipv4 connection tracking does reassembling of incomming fragmented packets. If such a reassembled packet is supposed to go via a xfrm ipv6 tunnel it will be droped, even if the original sender did proper fragmentation. According to RFC 2473 (section 7) tunnel ipv6 packets resulting from the encapsulation of an original packet are considered as locally generated packets. If such a packet passed the checks in xfrm{4,6}_tunnel_check_size() fragmentation is allowed according to RFC 2473 (section 7.1/7.2). This patch sets skb->local_df in xfrm6_prepare_output() to achieve fragmentation in this case. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Adrian Bunk 提交于
This patch adds the missing MODULE_LICENSE("GPL"). Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 4月, 2009 3 次提交
-
-
由 Pablo Neira Ayuso 提交于
This patch fixes a regression (introduced by myself in commit 19abb7b0: netfilter: ctnetlink: deliver events for conntracks changed from userspace) that results in an expectation re-insertion since __nf_ct_expect_check() may return 0 for expectation timer refreshing. This patch also removes a unnecessary refcount bump that pretended to avoid a possible race condition with event delivery and expectation timers (as said, not needed since we hold a reference to the object since until we finish the expectation setup). This also merges nf_ct_expect_related_report() and nf_ct_expect_related() which look basically the same. Reported-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Alex Riesen 提交于
It's plural, not LED_TRIGGERS. Signed-off-by: NAlex Riesen <fork0@users.sourceforge.net> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Eric Dumazet 提交于
Commit 78454473 (netfilter: iptables: lock free counters) broke ip6_tables by unconditionally returning ENOMEM in alloc_counters(), Reported-by: NGraham Murray <graham@gmurray.org.uk> Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 05 4月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
sock_alloc() currently uses following code to update sockets_in_use get_cpu_var(sockets_in_use)++; put_cpu_var(sockets_in_use); This translates to : c0436274: b8 01 00 00 00 mov $0x1,%eax c0436279: e8 42 40 df ff call c022a2c0 <add_preempt_count> c043627e: bb 20 4f 6a c0 mov $0xc06a4f20,%ebx c0436283: e8 18 ca f0 ff call c0342ca0 <debug_smp_processor_id> c0436288: 03 1c 85 60 4a 65 c0 add -0x3f9ab5a0(,%eax,4),%ebx c043628f: ff 03 incl (%ebx) c0436291: b8 01 00 00 00 mov $0x1,%eax c0436296: e8 75 3f df ff call c022a210 <sub_preempt_count> c043629b: 89 e0 mov %esp,%eax c043629d: 25 00 e0 ff ff and $0xffffe000,%eax c04362a2: f6 40 08 08 testb $0x8,0x8(%eax) c04362a6: 75 07 jne c04362af <sock_alloc+0x7f> c04362a8: 8d 46 d8 lea -0x28(%esi),%eax c04362ab: 5b pop %ebx c04362ac: 5e pop %esi c04362ad: c9 leave c04362ae: c3 ret c04362af: e8 cc 5d 09 00 call c04cc080 <preempt_schedule> c04362b4: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi c04362b8: eb ee jmp c04362a8 <sock_alloc+0x78> While percpu_add(sockets_in_use, 1) translates to a single instruction : c0436275: 64 83 05 20 5f 6a c0 addl $0x1,%fs:0xc06a5f20 Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 4月, 2009 1 次提交
-
-
由 Andy Adamson 提交于
On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be returned. It is up to the NFSv4.1 client to resend only the operations that have not been processed. Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in nfsd4_proc_compound() only when NFSv4.1 Sessions are used. Note: this isn't an adequate solution on its own. It's acceptable as a way to get some minimal 4.1 up and working, but we're going to have to find a way to avoid returning DELAY in all common cases before 4.1 can really be considered ready. Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NBenny Halevy <bhalevy@panasas.com> [nfsd41: reverse rq_nodeferral negative logic] Signed-off-by: NBenny Halevy <bhalevy@panasas.com> [sunrpc: initialize rq_usedeferral] Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NBenny Halevy <bhalevy@panasas.com> Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
-
- 03 4月, 2009 2 次提交
-
-
由 Ilpo Järvinen 提交于
It seems that trivial reset of pcount to one was not sufficient in tcp_retransmit_skb. Multiple counters experience a positive miscount when skb's pcount gets lowered without the necessary adjustments (depending on skb's sacked bits which exactly), at worst a packets_out miscount can crash at RTO if the write queue is empty! Triggering this requires mss change, so bidir tcp or mtu probe or like. Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Tested-by: NUwe Bugla <uwe.bugla@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ilpo Järvinen 提交于
We need full-scale adjustment to fix a TCP miscount in the next patch, so just move it into a helper and call for that from the other places. Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 4月, 2009 4 次提交
-
-
由 Stephen Hemminger 提交于
GRO assumes that there is a one-to-one relationship between NAPI structure and network device. Some devices like sky2 share multiple devices on a single interrupt so only have one NAPI handler. Rather than split GRO from NAPI, just have GRO assume if device changes that it is a different flow. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Commit 78454473 (netfilter: iptables: lock free counters) forgot to disable BH in arpt_do_table(), ipt_do_table() and ip6t_do_table() Use rcu_read_lock_bh() instead of rcu_read_lock() cures the problem. Reported-and-bisected-by: NRoman Mindalev <r000n@r000n.net> Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Acked-by: NPatrick McHardy <kaber@trash.net> Acked-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Grover 提交于
We have a 64bit value that needs to be set atomically. This is easy and quick on all 64bit archs, and can also be done on x86/32 with set_64bit() (uses cmpxchg8b). However other 32b archs don't have this. I actually changed this to the current state in preparation for mainline because the old way (using a spinlock on 32b) resulted in unsightly #ifdefs in the code. But obviously, being correct takes precedence. Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andy Grover 提交于
This fixes a bug where a connection was unexpectedly not on *any* list while being destroyed. It also cleans up some code duplication and regularizes some function names. * Grab appropriate lock in conn_free() and explain in comment * Ensure via locking that a conn is never not on either a dev's list or the nodev list * Add rds_xx_remove_conn() to match rds_xx_add_conn() * Make rds_xx_add_conn() return void * Rename remove_{,nodev_}conns() to destroy_{,nodev_}conns() and unify their implementation in a helper function * Document lock ordering as nodev conn_lock before dev_conn_lock Reported-by: NYosef Etigin <yosefe@voltaire.com> Signed-off-by: NAndy Grover <andy.grover@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-