- 11 10月, 2007 40 次提交
-
-
由 Denis V. Lunev 提交于
This patch make processing netlink user -> kernel messages synchronious. This change was inspired by the talk with Alexey Kuznetsov about current netlink messages processing. He says that he was badly wrong when introduced asynchronious user -> kernel communication. The call netlink_unicast is the only path to send message to the kernel netlink socket. But, unfortunately, it is also used to send data to the user. Before this change the user message has been attached to the socket queue and sk->sk_data_ready was called. The process has been blocked until all pending messages were processed. The bad thing is that this processing may occur in the arbitrary process context. This patch changes nlk->data_ready callback to get 1 skb and force packet processing right in the netlink_unicast. Kernel -> user path in netlink_unicast remains untouched. EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock drop, but the process remains in the cycle until the message will be fully processed. So, there is no need to use this kludges now. Signed-off-by: NDenis V. Lunev <den@openvz.org> Acked-by: NAlexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
There are currently two ways to determine whether the netlink socket is a kernel one or a user one. This patch creates a single inline call for this purpose and unifies all the calls in the af_netlink.c No similar calls are found outside af_netlink.c. Signed-off-by: NDenis V. Lunev <den@openvz.org> Acked-by: NAlexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
netlink_sendskb does not use third argument. Clean it and save a couple of bytes. Signed-off-by: NDenis V. Lunev <den@openvz.org> Acked-by: NAlexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
The code in netfilter/nfnetlink.c and in ./net/netlink/genetlink.c looks like outdated copy/paste from rtnetlink.c. Push them into sync with the original. Changes from v1: - deleted comment in nfnetlink_rcv_msg by request of Patrick McHardy Signed-off-by: NDenis V. Lunev <den@openvz.org> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Denis V. Lunev 提交于
There is no need to process outstanding netlink user->kernel packets during rtnl_unlock now. There is no rtnl_trylock in the rtnetlink_rcv anymore. Normal code path is the following: netlink_sendmsg netlink_unicast netlink_sendskb skb_queue_tail netlink_data_ready rtnetlink_rcv mutex_lock(&rtnl_mutex); netlink_run_queue(sk, qlen, &rtnetlink_rcv_msg); mutex_unlock(&rtnl_mutex); So, it is possible, that packets can be present in the rtnl->sk_receive_queue during rtnl_unlock, but there is no need to process them at that moment as rtnetlink_rcv for that packet is pending. Signed-off-by: NDenis V. Lunev <den@openvz.org> Acked-by: NAlexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tony Battersby 提交于
If kernel_accept() returns an error, it may pass back a pointer to freed memory (which the caller should ignore). Make it pass back NULL instead for better safety. Signed-off-by: NTony Battersby <tonyb@cybernetics.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Expansion of original idea from Denis V. Lunev <den@openvz.org> Add robustness and locking to the local_port_range sysctl. 1. Enforce that low < high when setting. 2. Use seqlock to ensure atomic update. The locking might seem like overkill, but there are cases where sysadmin might want to change value in the middle of a DoS attack. Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Add port randomization rather than a simple fixed rover for use with SCTP. This makes it act similar to TCP, UDP, DCCP when allocating ports. No longer need port_alloc_lock as well (suggestion by Brian Haley). Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
The fourth parameter of /proc/net/psched is supposed to show the timer resultion and is used by HTB userspace to calculate the necessary burst rate. Currently we show the clock resolution, which results in a too low burst rate when the two differ. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
This patch makes the IPv4 x->type->input functions return the next protocol instead of setting it directly. This is identical to how we do things in IPv6 and will help us merge common code on the input path. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
This patch moves the setting of the IP length and checksum fields out of the transforms and into the xfrmX_output functions. This would help future efforts in merging the transforms themselves. It also adds an optimisation to ipcomp due to the fact that the transport offset is guaranteed to be zero. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
This patch removes the duplicate ipv6_{auth,esp,comp}_hdr structures since they're identical to the IPv4 versions. Duplicating them would only create problems for ourselves later when we need to add things like extended sequence numbers. I've also added transport header type conversion headers for these types which are now used by the transforms. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
The IPv6 calling convention for x->mode->output is more general and could help an eventual protocol-generic x->type->output implementation. This patch adopts it for IPv4 as well and modifies the IPv4 type output functions accordingly. It also rewrites the IPv6 mac/transport header calculation to be based off the network header where practical. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
This patch changes the calling convention so that on entry from x->mode->output and before entry into x->type->output skb->data will point to the payload instead of the IP header. This is essentially a redistribution of skb_push/skb_pull calls with the aim of minimising them on the common path of tunnel + ESP. It'll also let us use the same calling convention between IPv4 and IPv6 with the next patch. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
The beet output function completely kills any extension headers by replacing them with the IPv6 header. This is because it essentially ignores the result of ip6_find_1stfragopt by simply acting as if there aren't any extension headers. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
I pointed this out back when this patch was first proposed but it looks like it got lost along the way. The checksum only needs to be ignored for NAT-T in transport mode where we lose the original inner addresses due to NAT. With BEET the inner addresses will be intact so the checksum remains valid. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mitsuru Chinen 提交于
To judge the timing for DAD, netif_carrier_ok() is used. However, there is a possibility that dev->qdisc stays noop_qdisc even if netif_carrier_ok() returns true. In that case, DAD NS is not sent out. We need to defer the IPv6 device initialization until a valid qdisc is specified. Signed-off-by: NMitsuru Chinen <mitch@linux.vnet.ibm.com> Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The unregister_netdevice() and dev_change_net_namespace() both check for dev->flags to be IFF_UP before calling the dev_close(), but the dev_close() checks for IFF_UP itself, so remove those unneeded checks. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ilpo Järvinen 提交于
Follows own function for each task principle, this is really somewhat separate task being done in sacktag. Also reduces indentation. In addition, added ack_seq local var to break some long lines & fixed coding style things. Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Just switch to the consolidated code. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Just switch to the consolidated code Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Just switch to the consolidated code. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Just switch to the consolidated calls. ipt_recent() has to initialize the private, so use the __seq_open_private() helper. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
This concerns the ipv4 and ipv6 code mostly, but also the netlink and unix sockets. The netlink code is an example of how to use the __seq_open_private() call - it saves the net namespace on this private. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mattias Nissler 提交于
The decryption handlers will skip the frame if the RX_FLAG_DECRYPTED flag is set, so the early flag setting introduced by Johannes breaks decryption. To work around this, call the handlers first and then set the flag. Signed-off-by: NMattias Nissler <mattias.nissler@gmx.de> Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
-
由 John W. Linville 提交于
Problem description by Daniel Drake <dsd@gentoo.org>: "This sequence of events causes loss of connectivity: <plug in> <associate as normal in managed mode> ifconfig eth7 down iwconfig eth7 mode monitor ifconfig eth7 up ifconfig eth7 down iwconfig eth7 mode managed <associate as normal> At this point you are associated but TX does not work. This is because the eth7 hard_start_xmit is still ieee80211_monitor_start_xmit." The problem is caused by ieee80211_if_set_type checking for a non-zero hard_start_xmit pointer value in order to avoid changing that value for master devices. The fix is to make that check more explicitly linked to master devices rather than simply checking if the value has been previously set. CC: Daniel Drake <dsd@gentoo.org> Acked-by: NMichael Wu <flamingice@sourmilk.net> Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
-
由 Herbert Xu 提交于
This patch releases the lock on the state before calling x->type->output. It also adds the lock to the spots where they're currently needed. Most of those places (all except mip6) are expected to disappear with async crypto. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
This patch adds locking so that when we're copying non-atomic fields such as life-time or coaddr to user-space we don't get a partial result. For af_key I've changed every instance of pfkey_xfrm_state2msg apart from expiration notification to include the keys and life-times. This is in-line with XFRM behaviour. The actual cases affected are: * pfkey_getspi: No change as we don't have any keys to copy. * key_notify_sa: + ADD/UPD: This wouldn't work otherwise. + DEL: It can't hurt. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Here's a good example of code duplication leading to code rot. The notification patch did its own netlink message creation for xfrm states. It duplicated code that was already in dump_one_state. Guess what, the next time (and the time after) when someone updated dump_one_state the notification path got zilch. This patch moves that code from dump_one_state to copy_to_user_state_extra and uses it in xfrm_notify_sa too. Unfortunately whoever updates this still needs to update xfrm_sa_len since the notification path wants to know the exact size for allocation. At least I've added a comment saying so and if someone still forgest, we'll have a WARN_ON telling us so. I also changed the security size calculation to use xfrm_user_sec_ctx since that's what we actually put into the skb. However it makes no practical difference since it has the same size as xfrm_sec_ctx. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
This patch moves some common code that conceptually belongs to the xfrm core from af_key/xfrm_user into xfrm_alloc_spi. In particular, the spin lock on the state is now taken inside xfrm_alloc_spi. Previously it also protected the construction of the response PF_KEY/XFRM messages to user-space. This is inconsistent as other identical constructions are not protected by the state lock. This is bad because they in fact should be protected but only in certain spots (so as not to hold the lock for too long which may cause packet drops). The SPI byte order conversion has also been moved. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
There is no point in waking people up when creating/updating larval states because they'll just go back to sleep again as larval states by definition cannot be found by xfrm_state_find. We should only wake them up when the larvals mature or die. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Current the x->mode->output functions store the IPv6 nh pointer in the skb network header. This is inconvenient because the network header then has to be fixed up before the packet can leave the IPsec stack. The mac header field is unused on output so we can use that to store this instead. This patch does that and removes the network header fix-up in xfrm_output. It also uses ipv6_hdr where appropriate in the x->type->output functions. There is also a minor clean-up in esp4 to make it use the same code as esp6 to help any subsequent effort to merge the two. Lastly it kills two redundant skb_set_* statements in BEET that were simply copied over from transport mode. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Constructs of the form xfrm_state_hold(x); foo(x); xfrm_state_put(x); tend to be broken because foo is either synchronous where this is totally unnecessary or if foo is asynchronous then the reference count is in the wrong spot. In the case of xfrm_secpath_reject, the function is synchronous and therefore we should just kill the reference count. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The newly created net namespace is set to 0 with memset() in setup_net(). The setup_net() is also called for the init_net_ns(), which is zeroed naturally as a global var. So remove this memset and allocate new nets with the kmem_cache_zalloc(). Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Benjamin Thery 提交于
In ip6_fib.c, fib6_clean_node() casts a fib6_walker_t pointer to a fib6_cleaner_t pointer assuming a struct fib6_walker_t (field 'w') is the first field in struct fib6_walker_t. To prevent any future problems that may occur if one day a field is inadvertently inserted before the 'w' field in struct fib6_cleaner_t, (and to improve readability), this patch uses the container_of() macro. Signed-off-by: NBenjamin Thery <benjamin.thery@bull.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
With the net namespaces many code leaved the __init section, thus making the kernel occupy more memory than it did before. Since we have a config option that prohibits the namespace creation, the functions that initialize/finalize some netns stuff are simply not needed and can be freed after the boot. Currently, this is almost not noticeable, since few calls are no longer in __init, but when the namespaces will be merged it will be possible to free more code. I propose to use the __net_init, __net_exit and __net_initdata "attributes" for functions/variables that are not used if the CONFIG_NET_NS is not set to save more space in memory. The exiting functions cannot just reside in the __exit section, as noticed by David, since the init section will have references on it and the compilation will fail due to modpost checks. These references can exist, since the init namespace never dies and the exit callbacks are never called. So I introduce the __exit_refok attribute just like it is already done with the __init_refok. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
A net_device struct provides field dev_id. It is used for unique ipv6 generation in case of shared network cards (as for the OSA network cards of IBM System z). If VLAN devices are built on top of such shared network cards, this dev_id information needs to be transferred to the VLAN device. Signed-off-by: NUrsula Braun <braunu@de.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
The lastused update check in xfrm_output can be done just as well in the mode output function which is specific to RO. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Now that the only callers of xfrm_replay_notify are in xfrm, we can remove the export. This patch also removes xfrm_aevent_doreplay since it's now called in just one spot. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
The replay counter is one of only two remaining things in the output code that requires a lock on the xfrm state (the other being the crypto). This patch moves it into the generic xfrm_output so we can remove the lock from the transforms themselves. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-