- 13 3月, 2011 2 次提交
-
-
由 David S. Miller 提交于
I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 3月, 2011 3 次提交
-
-
由 David S. Miller 提交于
Completely unused. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
fl->oif will always be zero on the input path, so there is no reason to test for that. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
All callers are under rcu_read_lock() protection already. Rename to ip_check_mc_rcu() to make it even more clear. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 3月, 2011 7 次提交
-
-
由 Stephen Hemminger 提交于
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Like in commit 44713b67 ("ipv4: Optimize flow initialization in output route lookup." we can optimize the on-stack flow setup to only initialize the members which are actually used. Otherwise we bzero the entire structure, then initialize explicitly the first half of it. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Like in commit 44713b67 ("ipv4: Optimize flow initialization in output route lookup." we can optimize the on-stack flow setup to only initialize the members which are actually used. Otherwise we bzero the entire structure, then initialize explicitly the first half of it. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vasiliy Kulikov 提交于
Since a8f80e8f any process with CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't allow anybody load any module not related to networking. This patch restricts an ability of autoloading modules to netdev modules with explicit aliases. This fixes CVE-2011-1019. Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior of loading netdev modules by name (without any prefix) for processes with CAP_SYS_MODULE to maintain the compatibility with network scripts that use autoloading netdev modules by aliases like "eth0", "wlan0". Currently there are only three users of the feature in the upstream kernel: ipip, ip_gre and sit. root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) -- root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: fffffff800001000 CapEff: fffffff800001000 CapBnd: fffffff800001000 root@albatros:~# modprobe xfs FATAL: Error inserting xfs (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted root@albatros:~# lsmod | grep xfs root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit sit: error fetching interface information: Device not found root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit0 sit0 Link encap:IPv6-in-IPv4 NOARP MTU:1480 Metric:1 root@albatros:~# lsmod | grep sit sit 10457 0 tunnel4 2957 1 sit For CAP_SYS_MODULE module loading is still relaxed: root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: ffffffffffffffff CapEff: ffffffffffffffff CapBnd: ffffffffffffffff root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs xfs 745319 0 Reference: https://lkml.org/lkml/2011/2/24/203Signed-off-by: NVasiliy Kulikov <segoon@openwall.com> Signed-off-by: NMichael Tokarev <mjt@tls.msk.ru> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NKees Cook <kees.cook@canonical.com> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Mario Schuknecht 提交于
In contrast to SIOCOUTQ which returns the amount of data sent but not yet acknowledged plus data not yet sent this patch only returns the data not sent. For various methods of live streaming bitrate control it may be helpful to know how much data are in the tcp outqueue are not sent yet. Signed-off-by: NMario Schuknecht <m.schuknecht@dresearch.de> Signed-off-by: NSteffen Sledz <sledz@dresearch.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Create a common helper for this operation, since we do it identically in three spots. Suggested by Eric Dumazet. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
In usual cases ifa_address == ifa_local, but in the case where SIOCSIFDSTADDR sets the destination address on a point-to-point link, ifa_address gets set to that destination address. Therefore we should use ifa_local when we want the local interface address. There were two cases where the selection was done incorrectly: 1) When devinet_ioctl() does matching, it checks ifa_address even though gifconf correct reported ifa_local to the user 2) IN_DEV_ARP_NOTIFY handling sends a gratuitous ARP using ifa_address instead of ifa_local. Reported-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 3月, 2011 2 次提交
-
-
由 David S. Miller 提交于
If modifications on other cpus are ok, then modifications to the tree during lookup done by the local cpu are ok too. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
We have to use cfg->fc_scope not the final nh_scope value. Reported-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 3月, 2011 3 次提交
-
-
由 David S. Miller 提交于
When doing output route lookups, we have to select the source address if the user has not specified an explicit one. First, if the route has an explicit preferred source address specified, then we use that. Otherwise we search the route's outgoing interface for a suitable address. This search can be precomputed and cached at route insertion time. The only missing part is that we have to refresh this precomputed value any time addresses are added or removed from the interface, and this is accomplished by fib_update_nh_saddrs(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This elimiates a lot of pure overhead due to parameter passing. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
fib_semantic_match() requires that if the type doesn't signal an automatic error, it must be of type RTN_UNICAST, RTN_LOCAL, RTN_BROADCAST, RTN_ANYCAST, or RTN_MULTICAST. Checking this every route lookup is pointless work. Instead validate it during route insertion, via fib_create_info(). Also, there was nothing making sure the type value was less than RTN_MAX, so add that missing check while we're here. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 3月, 2011 5 次提交
-
-
由 David S. Miller 提交于
The only necessary parts are the src/dst addresses, the interface indexes, the TOS, and the mark. The rest is unnecessary bloat, which amounts to nearly 50 bytes on 64-bit. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
rt->rt_iif is only ever inspected on input routes, for example DCCP uses this to populate a route lookup flow key when generating replies to another packet. Therefore, setting it to anything other than zero on output routes makes no sense. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
We know this is a new route object, so doing atomics and stuff makes no sense at all. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
We burn a lot of useless cycles, cpu store buffer traffic, and memory operations memset()'ing the on-stack flow used to perform output route lookups in __ip_route_output_key(). Only the first half of the flow object members even matter for output route lookups in this context, specifically: FIB rules matching cares about: dst, src, tos, iif, oif, mark FIB trie lookup cares about: dst FIB semantic match cares about: tos, scope, oif Therefore only initialize these specific members and elide the memset entirely. On Niagara2 this kills about ~300 cycles from the output route lookup path. Likely, we can take things further, since all callers of output route lookups essentially throw away the on-stack flow they use. So they don't care if we use it as a scratch-pad to compute the final flow key. Signed-off-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
-
由 Eric Dumazet 提交于
David noticed : ------------------ Eric, I was profiling the non-routing-cache case and something that stuck out is the case of calling inet_getpeer() with create==0. If an entry is not found, we have to redo the lookup under a spinlock to make certain that a concurrent writer rebalancing the tree does not "hide" an existing entry from us. This makes the case of a create==0 lookup for a not-present entry really expensive. It is on the order of 600 cpu cycles on my Niagara2. I added a hack to not do the relookup under the lock when create==0 and it now costs less than 300 cycles. This is now a pretty common operation with the way we handle COW'd metrics, so I think it's definitely worth optimizing. ----------------- One solution is to use a seqlock instead of a spinlock to protect struct inet_peer_base. After a failed avl tree lookup, we can easily detect if a writer did some changes during our lookup. Taking the lock and redo the lookup is only necessary in this case. Note: Add one private rcu_deref_locked() macro to place in one spot the access to spinlock included in seqlock. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 3月, 2011 2 次提交
-
-
由 David S. Miller 提交于
Reported-by: NStephen Hemminger <shemminger@vyatta.com> Reported-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
As reported by Eric: [11483.697233] IP: [<c12b0638>] dst_release+0x18/0x60 ... [11483.697741] Call Trace: [11483.697764] [<c12fc9d2>] udp_sendmsg+0x282/0x6e0 [11483.697790] [<c12a1c01>] ? memcpy_toiovec+0x51/0x70 [11483.697818] [<c12dbd90>] ? ip_generic_getfrag+0x0/0xb0 The pointer passed to dst_release() is -EINVAL, that's because we leave an error pointer in the local variable "rt" by accident. NULL it out to fix the bug. Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 3月, 2011 3 次提交
-
-
由 David S. Miller 提交于
This avoid a stack frame at zero cost. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Instead of on the stack. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Instead of on the stack. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 3月, 2011 13 次提交
-
-
由 Herbert Xu 提交于
The patch to replace inet->cork with cork left out two spots in __ip_append_data that can result in bogus packet construction. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The route lookup code in icmp_send() is slightly tricky as a result of having to handle all of the requirements of RFC 4301 host relookups. Pull the route resolution into a seperate function, so that the error handling and route reference counting is hopefully easier to see and contained wholly within this new routine. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
That way we don't have to potentially do this in every xfrm_lookup() caller. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This can be determined from the flow flags instead. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This boolean state is now available in the flow flags. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
And set is in contexts where the route resolution can sleep. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Since that is what the current vague "flags" argument means. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Since that's what the current vague "flags" thing means. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
The UDP transmit path has been running under the socket lock for a long time because of the corking feature. This means that transmitting to the same socket in multiple threads does not scale at all. However, as most users don't actually use corking, the locking can be removed in the common case. This patch creates a lockless fast path where corking is not used. Please note that this does create a slight inaccuracy in the enforcement of socket send buffer limits. In particular, we may exceed the socket limit by up to (number of CPUs) * (packet size) because of the way the limit is computed. As the primary purpose of socket buffers is to indicate congestion, this should not be a great problem for now. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
This patch converts UDP to use the new ip_finish_skb API. This would then allows us to more easily use ip_make_skb which allows UDP to run without a socket lock. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
This patch adds the helper ip_make_skb which is like ip_append_data and ip_push_pending_frames all rolled into one, except that it does not send the skb produced. The sending part is carried out by ip_send_skb, which the transport protocol can call after it has tweaked the skb. It is meant to be called in cases where corking is not used should have a one-to-one correspondence to sendmsg. This patch also adds the helper ip_finish_skb which is meant to be replace ip_push_pending_frames when corking is required. Previously the protocol stack would peek at the socket write queue and add its header to the first packet. With ip_finish_skb, the protocol stack can directly operate on the final skb instead, just like the non-corking case with ip_make_skb. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
In order to allow simultaneous calls to ip_append_data on the same socket, it must not modify any shared state in sk or inet (other than those that are designed to allow that such as atomic counters). This patch abstracts out write references to sk and inet_sk in ip_append_data and its friends so that we may use the underlying code in parallel. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
UFO doesn't really use the sk_sndmsg_* parameters so touching them is pointless. It can't use them anyway since the whole point of UFO is to use the original pages without copying. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-