- 25 11月, 2008 1 次提交
-
-
由 Eric Dumazet 提交于
We can reduce pressure on dst entry refcount that slowdown UDP transmit path on SMP machines. This pressure is visible on RTP servers when delivering content to mediagateways, especially big ones, handling thousand of streams. Several cpus send UDP frames to the same destination, hence use the same dst entry. This patch makes ip_append_data() eventually steal the refcount its callers had to take on the dst entry. This doesnt avoid all refcounting, but still gives speedups on SMP, on UDP/RAW transmit path Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 11月, 2008 4 次提交
-
-
由 Eric Dumazet 提交于
The rule of calling sock_prot_inuse_add() is that BHs must be disabled. Some new calls were added where this was not true and this tiggers warnings as reported by Ilpo. Fix this by adding explicit BH disabling around those call sites, or moving sock_prot_inuse_add() call inside an existing BH disabled section. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexey Dobriyan 提交于
dev_net_set() should be the very first thing after alloc_netdev(). "ndo_" changes turned simple assignment (which is OK to do before netns assignment) into quite non-trivial operation (which is not OK, init_net was used). This leads to incomplete initialisation of tunnel device in netns. BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f *pde = 00000000 Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC last sysfs file: /sys/class/net/lo/operstate Pid: 10, comm: netns Not tainted (2.6.28-rc6 #1) EIP: 0060:[<c02efdb5>] EFLAGS: 00010246 CPU: 0 EIP is at ip6_tnl_exit_net+0x37/0x4f EAX: 00000000 EBX: 00000020 ECX: 00000000 EDX: 00000003 ESI: c5caef30 EDI: c782bbe8 EBP: c7909f50 ESP: c7909f48 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process netns (pid: 10, ti=c7908000 task=c7905780 task.ti=c7908000) Stack: c03e75e0 c7390bc8 c7909f60 c0245448 c7390bd8 c7390bf0 c7909fa8 c012577a 00000000 00000002 00000000 c0125736 c782bbe8 c7909f90 c0308fe3 c782bc04 c7390bd4 c0245406 c084b718 c04f0770 c03ad785 c782bbe8 c782bc04 c782bc0c Call Trace: [<c0245448>] ? cleanup_net+0x42/0x82 [<c012577a>] ? run_workqueue+0xd6/0x1ae [<c0125736>] ? run_workqueue+0x92/0x1ae [<c0308fe3>] ? schedule+0x275/0x285 [<c0245406>] ? cleanup_net+0x0/0x82 [<c0125ae1>] ? worker_thread+0x81/0x8d [<c0128344>] ? autoremove_wake_function+0x0/0x33 [<c0125a60>] ? worker_thread+0x0/0x8d [<c012815c>] ? kthread+0x39/0x5e [<c0128123>] ? kthread+0x0/0x5e [<c0103b9f>] ? kernel_thread_helper+0x7/0x10 Code: db e8 05 ff ff ff 89 c6 e8 dc 04 f6 ff eb 08 8b 40 04 e8 38 89 f5 ff 8b 44 9e 04 85 c0 75 f0 43 83 fb 20 75 f2 8b 86 84 00 00 00 <8b> 40 04 e8 1c 89 f5 ff e8 98 04 f6 ff 89 f0 e8 f8 63 e6 ff 5b EIP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f SS:ESP 0068:c7909f48 ---[ end trace 6c2f2328fccd3e0c ]--- Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This is the last step to be able to perform full RCU lookups in __inet_lookup() : After established/timewait tables, we add RCU lookups to listening hash table. The only trick here is that a socket of a given type (TCP ipv4, TCP ipv6, ...) can now flight between two different tables (established and listening) during a RCU grace period, so we must use different 'nulls' end-of-chain values for two tables. We define a large value : #define LISTENING_NULLS_BASE (1U << 29) So that slots in listening table are guaranteed to have different end-of-chain values than slots in established table. A reader can still detect it finished its lookup in the right chain. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
1) Use eq_net() in inet_netns_ok() to speedup socket creation if !CONFIG_NET_NS 2) Reorder the tests about inet_ehash_secret generation (once only) Use the unlikely() macro when testing if inet_ehash_secret already generated. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 11月, 2008 1 次提交
-
-
由 David S. Miller 提交于
They are no longer a rwlocks. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 11月, 2008 4 次提交
-
-
由 Eric Dumazet 提交于
Now TCP & DCCP use RCU lookups, we can convert ehash rwlocks to spinlocks. /proc/net/tcp and other seq_file 'readers' can safely be converted to 'writers'. This should speedup writers, since spin_lock()/spin_unlock() only use one atomic operation instead of two for write_lock()/write_unlock() Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Convert ipgre tunnel to netdevice ops. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Convert to network device ops. Needed to change to directly call the init routine since two sides share same ops. In the process found by inspection a device ref count leak if register_netdevice failed. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Convert to new network device ops interface. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 11月, 2008 8 次提交
-
-
由 Harvey Harrison 提交于
Fixes sparse warnings: net/ipv4/ip_sockglue.c:146:15: warning: incorrect type in assignment (different base types) net/ipv4/ip_sockglue.c:146:15: expected restricted __be16 [assigned] [usertype] sin_port net/ipv4/ip_sockglue.c:146:15: got unsigned short [unsigned] [short] [usertype] <noident> net/ipv4/ip_sockglue.c:130:6: warning: symbol 'ip_cmsg_recv_dstaddr' was not declared. Should it be static? Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Balazs Scheidler 提交于
inet_sk_rebuild_header() does a new route lookup if the dst_entry associated with a socket becomes stale. However inet_sk_rebuild_header() didn't use struct flowi->flags, causing the route lookup to fail for foreign-bound IP_TRANSPARENT sockets, causing an error state to be set for the sockets in question. Signed-off-by: NBalazs Scheidler <bazsi@balabit.hu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Balazs Scheidler 提交于
udp_sendmsg() didn't fill struct flowi->flags, which means that the route lookup would fail for non-local IPs even if the IP_TRANSPARENT sockopt was set. This prevents sendto() to work properly for UDP sockets, whereas bind(foreign-ip) + connect() + send() worked fine. Signed-off-by: NBalazs Scheidler <bazsi@balabit.hu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This patch prepares RCU migration of listening_hash table for TCP/DCCP protocols. listening_hash table being small (32 slots per protocol), we add a spinlock for each slot, instead of a single rwlock for whole table. This should reduce hold time of readers, and writers concurrency. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Convert to net_device_ops function table pointer for ioctl. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
The first argument to csum_partial is const void * casts to char/u8 * are not necessary Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Benjamin Thery 提交于
Similarly to IPv6 ip6_mr_init() (fixed last week), the order of cleanup operations in the error/exit section of ip_mr_init() is completely inversed. It should be the other way around. Also a del_timer() is missing in the error path. I should have guessed last week that this same error existed in ipmr.c too, as ip6mr.c is largely inspired by ipmr.c. Signed-off-by: NBenjamin Thery <benjamin.thery@bull.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 11月, 2008 4 次提交
-
-
由 Eric Dumazet 提交于
RCU was added to UDP lookups, using a fast infrastructure : - sockets kmem_cache use SLAB_DESTROY_BY_RCU and dont pay the price of call_rcu() at freeing time. - hlist_nulls permits to use few memory barriers. This patch uses same infrastructure for TCP/DCCP established and timewait sockets. Thanks to SLAB_DESTROY_BY_RCU, no slowdown for applications using short lived TCP connections. A followup patch, converting rwlocks to spinlocks will even speedup this case. __inet_lookup_established() is pretty fast now we dont have to dirty a contended cache line (read_lock/read_unlock) Only established and timewait hashtable are converted to RCU (bind table and listen table are still using traditional locking) Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This is a straightforward patch, using hlist_nulls infrastructure. RCUification already done on UDP two weeks ago. Using hlist_nulls permits us to avoid some memory barriers, both at lookup time and delete time. Patch is large because it adds new macros to include/net/sock.h. These macros will be used by TCP & DCCP in next patch. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Balazs Scheidler 提交于
In case UDP traffic is redirected to a local UDP socket, the originally addressed destination address/port cannot be recovered with the in-kernel tproxy. This patch adds an IP_RECVORIGDSTADDR sockopt that enables a IP_ORIGDSTADDR ancillary message in recvmsg(). This ancillary message contains the original destination address/port of the packet being received. Signed-off-by: NBalazs Scheidler <bazsi@balabit.hu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Greear 提交于
Ben Greear wrote: > I have 500 mac-vlans on a system talking to 500 other > mac-vlans. My problem is that the arp-table gets extremely > huge because every time an arp-request comes in on all mac-vlans, > a stale arp entry is added for each mac-vlan. I have filtering > turned on, but that doesn't help because the neigh_event_ns call > below will cause a stale neighbor entry to be created regardless > of whether a replay will be sent or not. > Maybe the neigh_event code should be below the checks for dont_send, > and only create check neigh_event_ns if we are !dont_send? The attached patch makes it work much better for me. The patch will cause the code to NOT create a stale neighbor entry if we are not going to respond to the ARP request. The old code *would* create a stale entry even if we are not going to respond. Signed-off-by: NBen Greear <greearb@candelatech.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 11月, 2008 1 次提交
-
-
由 Alexey Dobriyan 提交于
Failure to pass netns_ok check is SILENT, except some MIB counter is incremented somewhere. And adding "netns_ok = 1" (after long head-scratching session) is usually the last step in making some protocol netns-ready... Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 11月, 2008 3 次提交
-
-
由 Doug Leith 提交于
This patch fixes a minor bug in tcp_htcp.c which has been highlighted by Lachlan Andrew and Lawrence Stewart. Currently, the time since the last congestion event, which is stored in variable last_cong, is reset whenever there is a state change into TCP_CA_Open. This includes transitions of the type TCP_CA_Open->TCP_CA_Disorder->TCP_CA_Open which are not associated with backoff of cwnd. The patch changes last_cong to be updated only on transitions into TCP_CA_Open that occur after experiencing the congestion-related states TCP_CA_Loss, TCP_CA_Recovery, TCP_CA_CWR. Signed-off-by: NDoug Leith <doug.leith@nuim.ie> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We can shrink size of "struct inet_bind_bucket" by 50%, using read_pnet() and write_pnet() Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexey Dobriyan 提交于
Unused after kmem_cache_zalloc() conversion. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 11月, 2008 1 次提交
-
-
由 Eric Dumazet 提交于
icmpmsg_put() can happily corrupt kernel memory, using a static table and forgetting to reset an array index in a loop. Remove the static array since its not safe without proper locking. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 11月, 2008 2 次提交
-
-
由 David S. Miller 提交于
Vito Caputo noticed that tcp_recvmsg() returns immediately from partial reads when MSG_PEEK is used. In particular, this means that SO_RCVLOWAT is not respected. Simply remove the test. And this matches the behavior of several other systems, including BSD. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andreas Steffen 提交于
While adding MIGRATE support to strongSwan, Andreas Steffen noticed that the selectors provided in XFRM_MSG_ACQUIRE have their family field uninitialized (those in MIGRATE do have their family set). Looking at the code, this is because the af-specific init_tempsel() (called via afinfo->init_tempsel() in xfrm_init_tempsel()) do not set the value. Reported-by: NAndreas Steffen <andreas.steffen@strongswan.org> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NArnaud Ebalard <arno@natisbad.org>
-
- 04 11月, 2008 1 次提交
-
-
由 Alexey Dobriyan 提交于
I want to compile out proc_* and sysctl_* handlers totally and stub them to NULL depending on config options, however usage of & will prevent this, since taking adress of NULL pointer will break compilation. So, drop & in front of every ->proc_handler and every ->strategy handler, it was never needed in fact. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 11月, 2008 10 次提交
-
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jianjun Kong 提交于
Signed-off-by: NJianjun Kong <jianjun@zeuux.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-