- 16 6月, 2011 5 次提交
-
-
由 Julian Anastasov 提交于
Avoid double seq adjustment for loopback traffic because it causes silent repetition of TCP data. One example is passive FTP with DNAT rule and difference in the length of IP addresses. This patch adds check if packet is sent and received via loopback device. As the same conntrack is used both for outgoing and incoming direction, we restrict seq adjustment to happen only in POSTROUTING. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Nicolas Cavallari 提交于
By default, when broadcast or multicast packet are sent from a local application, they are sent to the interface then looped by the kernel to other local applications, going throught netfilter hooks in the process. These looped packet have their MAC header removed from the skb by the kernel looping code. This confuse various netfilter's netlink queue, netlink log and the legacy ip_queue, because they try to extract a hardware address from these packets, but extracts a part of the IP header instead. This patch prevent NFQUEUE, NFLOG and ip_QUEUE to include a MAC header if there is none in the packet. Signed-off-by: NNicolas Cavallari <cavallar@lri.fr> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Patrick McHardy 提交于
Userspace allows to specify inversion for IP header ECN matches, the kernel silently accepts it, but doesn't invert the match result. Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Patrick McHardy 提交于
Check for protocol inversion in ecn_mt_check() and remove the unnecessary runtime check for IPPROTO_TCP in ecn_mt(). Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
Signed-off-by: NSebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 10 6月, 2011 1 次提交
-
-
由 Steffen Klassert 提交于
We assume that transhdrlen is positive on the first fragment which is wrong for raw packets. So we don't add exthdrlen to the packet size for raw packets. This leads to a reallocation on IPsec because we have not enough headroom on the skb to place the IPsec headers. This patch fixes this by adding exthdrlen to the packet size whenever the send queue of the socket is empty. This issue was introduced with git commit 1470ddf7 (inet: Remove explicit write references to sk/inet in ip_append_data) Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 6月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
commit 2c8cec5c (ipv4: Cache learned PMTU information in inetpeer) added some racy peer->pmtu_expires accesses. As its value can be changed by another cpu/thread, we should be more careful, reading its value once. Add peer_pmtu_expired() and peer_pmtu_cleaned() helpers Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 6月, 2011 4 次提交
-
-
由 Dave Jones 提交于
Netlink message lengths can't be negative, so use unsigned variables. Signed-off-by: NDave Jones <davej@redhat.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
This patch fixes a refcount leak of ct objects that may occur if l4proto->error() assigns one conntrack object to one skbuff. In that case, we have to skip further processing in nf_conntrack_in(). With this patch, we can also fix wrong return values (-NF_ACCEPT) for special cases in ICMP[v6] that should not bump the invalid/error statistic counters. Reported-by: NZoltan Menyhart <Zoltan.Menyhart@bull.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Julian Anastasov 提交于
Fix crash in nf_nat_csum when mangling packets in OUTPUT hook where skb->dev is not defined, it is set later before POSTROUTING. Problem happens for CHECKSUM_NONE. We can check device from rt but using CHECKSUM_PARTIAL should be safe (skb_checksum_help). Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Eric Dumazet 提交于
Following error is raised (and other similar ones) : net/ipv4/netfilter/nf_nat_standalone.c: In function ‘nf_nat_fn’: net/ipv4/netfilter/nf_nat_standalone.c:119:2: warning: case value ‘4’ not in enumerated type ‘enum ip_conntrack_info’ gcc barfs on adding two enum values and getting a not enumerated result : case IP_CT_RELATED+IP_CT_IS_REPLY: Add missing enum values Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: David Miller <davem@davemloft.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 02 6月, 2011 1 次提交
-
-
由 Marcus Meissner 提交于
Check against mistakenly passing in IPv6 addresses (which would result in an INADDR_ANY bind) or similar incompatible sockaddrs. Signed-off-by: NMarcus Meissner <meissner@suse.de> Cc: Reinhard Max <max@suse.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 6月, 2011 1 次提交
-
-
由 Chris Metcalf 提交于
The current code takes an unaligned pointer and does htonl() on it to make it big-endian, then does a memcpy(). The problem is that the compiler decides that since the pointer is to a __be32, it is legal to optimize the copy into a processor word store. However, on an architecture that does not handled unaligned writes in kernel space, this produces an unaligned exception fault. The solution is to track the pointer as a "char *" (which removes a bunch of unpleasant casts in any case), and then just use put_unaligned_be32() to write the value to memory. Signed-off-by: NChris Metcalf <cmetcalf@tilera.com> Signed-off-by: NDavid S. Miller <davem@zippy.davemloft.net>
-
- 28 5月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Several crashes in cleanup_once() were reported in recent kernels. Commit d6cc1d64 (inetpeer: various changes) added a race in unlink_from_unused(). One way to avoid taking unused_peers.lock before doing the list_empty() test is to catch 0->1 refcnt transitions, using full barrier atomic operations variants (atomic_cmpxchg() and atomic_inc_return()) instead of previous atomic_inc() and atomic_add_unless() variants. We then call unlink_from_unused() only for the owner of the 0->1 transition. Add a new atomic_add_unless_return() static helper With help from Arun Sharma. Refs: https://bugzilla.kernel.org/show_bug.cgi?id=32772Reported-by: NArun Sharma <asharma@fb.com> Reported-by: NMaximilian Engelhardt <maxi@daemonizer.de> Reported-by: NYann Dupont <Yann.Dupont@univ-nantes.fr> Reported-by: NDenys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 5月, 2011 1 次提交
-
-
由 Veaceslav Falico 提交于
In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number of source filters per mulitcast. However, igmp_group_dropped() is also called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source filters. To fix that, we must clear the source filters only when there are no users in the ip_mc_list, i.e. in ip_mc_dec_group() and on device destroy. Acked-by: NDavid L Stevens <dlstevens@us.ibm.com> Signed-off-by: NVeaceslav Falico <vfalico@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 5月, 2011 3 次提交
-
-
由 Eric Dumazet 提交于
All static seqlock should be initialized with the lockdep friendly __SEQLOCK_UNLOCKED() macro. Remove legacy SEQLOCK_UNLOCKED() macro. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Link: http://lkml.kernel.org/r/%3C1306238888.3026.31.camel%40edumazet-laptop%3ESigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Dan Rosenberg 提交于
The %pK format specifier is designed to hide exposed kernel pointers, specifically via /proc interfaces. Exposing these pointers provides an easy target for kernel write vulnerabilities, since they reveal the locations of writable structures containing easily triggerable function pointers. The behavior of %pK depends on the kptr_restrict sysctl. If kptr_restrict is set to 0, no deviation from the standard %p behavior occurs. If kptr_restrict is set to 1, the default, if the current user (intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG (currently in the LSM tree), kernel pointers using %pK are printed as 0's. If kptr_restrict is set to 2, kernel pointers using %pK are printed as 0's regardless of privileges. Replacing with 0's was chosen over the default "(null)", which cannot be parsed by userland %p, which expects "(nil)". The supporting code for kptr_restrict and %pK are currently in the -mm tree. This patch converts users of %p in net/ to %pK. Cases of printing pointers to the syslog are not covered, since this would eliminate useful information for postmortem debugging and the reading of the syslog is already optionally protected by the dmesg_restrict sysctl. Signed-off-by: NDan Rosenberg <drosenberg@vsecurity.com> Cc: James Morris <jmorris@namei.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Thomas Graf <tgraf@infradead.org> Cc: Eugene Teo <eugeneteo@kernel.org> Cc: Kees Cook <kees.cook@canonical.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David S. Miller <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
net/ipv4/ping.c: In function ‘ping_v4_unhash’: net/ipv4/ping.c:140:28: warning: variable ‘hslot’ set but not used Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Vasiliy Kulikov <segoon@openwall.com> Acked-by: NVasiliy Kulikov <segoon@openwall.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 5月, 2011 3 次提交
-
-
由 Paul Gortmaker 提交于
After discovering that wide use of prefetch on modern CPUs could be a net loss instead of a win, net drivers which were relying on the implicit inclusion of prefetch.h via the list headers showed up in the resulting cleanup fallout. Give them an explicit include via the following $0.02 script. ========================================= #!/bin/bash MANUAL="" for i in `git grep -l 'prefetch(.*)' .` ; do grep -q '<linux/prefetch.h>' $i if [ $? = 0 ] ; then continue fi ( echo '?^#include <linux/?a' echo '#include <linux/prefetch.h>' echo . echo w echo q ) | ed -s $i > /dev/null 2>&1 if [ $? != 0 ]; then echo $i needs manual fixup MANUAL="$i $MANUAL" fi done echo ------------------- 8\<---------------------- echo vi $MANUAL ========================================= Signed-off-by: NPaul <paul.gortmaker@windriver.com> [ Fixed up some incorrect #include placements, and added some non-network drivers and the fib_trie.c case - Linus ] Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dave Jones 提交于
Add a stack backtrace to the ip_rt_bug path for debugging Signed-off-by: NDave Jones <davej@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 5月, 2011 3 次提交
-
-
由 Micha Nelissen 提交于
v3 -> v4: fix return boolean false instead of 0 for ic_is_init_dev Currently the ip auto configuration has a hardcoded delay of 1 second. When (ethernet) link takes longer to come up (e.g. more than 3 seconds), nfs root may not be found. Remove the hardcoded delay, and wait for carrier on at least one network device. Signed-off-by: NMicha Nelissen <micha@neli.hopto.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Changli Gao 提交于
The characters in a line should be no more than 80. Signed-off-by: NChangli Gao <xiaosuo@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Changli Gao 提交于
As these functions are only used in this file. Signed-off-by: NChangli Gao <xiaosuo@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 5月, 2011 4 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This will next trickle down to rt_bind_peer(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This way the caller can get at the fully resolved fl4->{daddr,saddr} etc. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
It's way past it's usefulness. And this gets rid of a bunch of stray ->rt_{dst,src} references. Even the comment documenting the macro was inaccurate (stated default was 1 when it's 0). If reintroduced, it should be done properly, with dynamic debug facilities. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 5月, 2011 2 次提交
-
-
由 David S. Miller 提交于
Noticed by Joe Perches. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vasiliy Kulikov 提交于
If CONFIG_PROC_SYSCTL=n the building process fails: ping.c:(.text+0x52af3): undefined reference to `inet_get_ping_group_range_net' Moved inet_get_ping_group_range_net() to ping.c. Reported-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NVasiliy Kulikov <segoon@openwall.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Acked-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 5月, 2011 2 次提交
-
-
由 Eric Dumazet 提交于
Commit 6623e3b2 (ipv4: IP defragmentation must be ECN aware) was an attempt to not lose "Congestion Experienced" (CE) indications when performing datagram defragmentation. Stefanos Harhalakis raised the point that RFC 3168 requirements were not completely met by this commit. In particular, we MUST detect invalid combinations and eventually drop illegal frames. Reported-by: NStefanos Harhalakis <v13@v13.gr> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
At these points we have a fully filled in value via the IP header the form of ip_hdr(skb)->saddr Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 5月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
udp_ioctl() really handles UDP and UDPLite protocols. 1) It can increment UDP_MIB_INERRORS in case first_packet_length() finds a frame with bad checksum. 2) It has a dependency on sizeof(struct udphdr), not applicable to ICMP/PING If ping sockets need to handle SIOCINQ/SIOCOUTQ ioctl, this should be done differently. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Vasiliy Kulikov <segoon@openwall.com> Acked-by: NVasiliy Kulikov <segoon@openwall.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 5月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
ping_table is not __read_mostly, since it contains one rwlock, and is static to ping.c ping_port_rover & ping_v4_lookup are static Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Acked-by: NVasiliy Kulikov <segoon@openwall.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 5月, 2011 5 次提交
-
-
由 David S. Miller 提交于
At this point iph->daddr equals what rt->rt_dst would hold. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Pass in the sk_buff so that we can fetch the necessary keys from the packet header when working with input routes. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This will allow ip_options_build() to reliably look at the values of iph->{daddr,saddr} Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This code block executes when opt->srr_is_hit is set. It will be set only by ip_options_rcv_srr(). ip_options_rcv_srr() walks until it hits a matching nexthop in the SRR option addresses, and when it matches one 1) looks up the route for that nexthop and 2) on route lookup success it writes that nexthop value into iph->daddr. ip_forward_options() runs later, and again walks the SRR option addresses looking for the option matching the destination of the route stored in skb_rtable(). This route will be the same exact one looked up for the nexthop by ip_options_rcv_srr(). Therefore "rt->rt_dst == iph->daddr" must be true. All it really needs to do is record the route's source address in the matching SRR option adddress. It need not write iph->daddr again, since that has already been done by ip_options_rcv_srr() as detailed above. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vasiliy Kulikov 提交于
This patch adds IPPROTO_ICMP socket kind. It makes it possible to send ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages without any special privileges. In other words, the patch makes it possible to implement setuid-less and CAP_NET_RAW-less /bin/ping. In order not to increase the kernel's attack surface, the new functionality is disabled by default, but is enabled at bootup by supporting Linux distributions, optionally with restriction to a group or a group range (see below). Similar functionality is implemented in Mac OS X: http://www.manpagez.com/man/4/icmp/ A new ping socket is created with socket(PF_INET, SOCK_DGRAM, PROT_ICMP) Message identifiers (octets 4-5 of ICMP header) are interpreted as local ports. Addresses are stored in struct sockaddr_in. No port numbers are reserved for privileged processes, port 0 is reserved for API ("let the kernel pick a free number"). There is no notion of remote ports, remote port numbers provided by the user (e.g. in connect()) are ignored. Data sent and received include ICMP headers. This is deliberate to: 1) Avoid the need to transport headers values like sequence numbers by other means. 2) Make it easier to port existing programs using raw sockets. ICMP headers given to send() are checked and sanitized. The type must be ICMP_ECHO and the code must be zero (future extensions might relax this, see below). The id is set to the number (local port) of the socket, the checksum is always recomputed. ICMP reply packets received from the network are demultiplexed according to their id's, and are returned by recv() without any modifications. IP header information and ICMP errors of those packets may be obtained via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source quenches and redirects are reported as fake errors via the error queue (IP_RECVERR); the next hop address for redirects is saved to ee_info (in network order). socket(2) is restricted to the group range specified in "/proc/sys/net/ipv4/ping_group_range". It is "1 0" by default, meaning that nobody (not even root) may create ping sockets. Setting it to "100 100" would grant permissions to the single group (to either make /sbin/ping g+s and owned by this group or to grant permissions to the "netadmins" group), "0 4294967295" would enable it for the world, "100 4294967295" would enable it for the users, but not daemons. The existing code might be (in the unlikely case anyone needs it) extended rather easily to handle other similar pairs of ICMP messages (Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply etc.). Userspace ping util & patch for it: http://openwall.info/wiki/people/segoon/ping For Openwall GNU/*/Linux it was the last step on the road to the setuid-less distro. A revision of this patch (for RHEL5/OpenVZ kernels) is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs: http://mirrors.kernel.org/openwall/Owl/current/iso/ Initially this functionality was written by Pavel Kankovsky for Linux 2.4.32, but unfortunately it was never made public. All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with the patch. PATCH v3: - switched to flowi4. - minor changes to be consistent with raw sockets code. PATCH v2: - changed ping_debug() to pr_debug(). - removed CONFIG_IP_PING. - removed ping_seq_fops.owner field (unused for procfs). - switched to proc_net_fops_create(). - switched to %pK in seq_printf(). PATCH v1: - fixed checksumming bug. - CAP_NET_RAW may not create icmp sockets anymore. RFC v2: - minor cleanups. - introduced sysctl'able group range to restrict socket(2). Signed-off-by: NVasiliy Kulikov <segoon@openwall.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 5月, 2011 1 次提交
-
-
由 David S. Miller 提交于
I swear none of my compilers warned about this, yet it is so obvious. > net/ipv4/ip_forward.c: In function 'ip_forward': > net/ipv4/ip_forward.c:87: warning: 'iph' may be used uninitialized in this function Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-