- 21 9月, 2010 2 次提交
-
-
由 Julian Anastasov 提交于
Add new sysctl flag "snat_reroute". Recent kernels use ip_route_me_harder() to route LVS-NAT responses properly by VIP when there are multiple paths to client. But setups that do not have alternative default routes can skip this routing lookup by using snat_reroute=0. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Julian Anastasov 提交于
Add more code to IPVS to work with Netfilter connection tracking and fix some problems. - Allow IPVS to be compiled without connection tracking as in 2.6.35 and before. This can avoid keeping conntracks for all IPVS connections because this costs memory. ip_vs_ftp still depends on connection tracking and NAT as implemented for 2.6.36. - Add sysctl var "conntrack" to enable connection tracking for all IPVS connections. For loaded IPVS directors it needs tuning of nf_conntrack_max limit. - Add IP_VS_CONN_F_NFCT connection flag to request the connection to use connection tracking. This allows user space to provide this flag, for example, in dest->conn_flags. This can be useful to request connection tracking per real server instead of forcing it for all connections with the "conntrack" sysctl. This flag is set currently only by ip_vs_ftp and of course by "conntrack" sysctl. - Add ip_vs_nfct.c file to hold all connection tracking code, by this way main code should not depend of netfilter conntrack support. - Return back the ip_vs_post_routing handler as in 2.6.35 and use skb->ipvs_property=1 to allow IPVS to work without connection tracking Connection tracking: - most of the code is already in 2.6.36-rc - alter conntrack reply tuple for LVS-NAT connections when first packet from client is forwarded and conntrack state is NEW or RELATED. Additionally, alter reply for RELATED connections from real server, again for packet in original direction. - add IP_VS_XMIT_TUNNEL to confirm conntrack (without altering reply) for LVS-TUN early because we want to call nf_reset. It is needed because we add IPIP header and the original conntrack should be preserved, not destroyed. The transmitted IPIP packets can reuse same conntrack, so we do not set skb->ipvs_property. - try to destroy conntrack when the IPVS connection is destroyed. It is not fatal if conntrack disappears before that, it depends on the used timers. Fix problems from long time: - add skb->ip_summed = CHECKSUM_NONE for the LVS-TUN transmitters Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 17 9月, 2010 1 次提交
-
-
由 Julian Anastasov 提交于
- the sync protocol supports 16 bits only, so bits 0..15 should be used only for flags that should go to backup server, bits 16 and above should be allocated for flags not sent to backup. - use IP_VS_CONN_F_DEST_MASK as mask of connection flags in destination that can be changed by user space - allow IP_VS_CONN_F_ONE_PACKET to be set in destination Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 09 9月, 2010 1 次提交
-
-
由 Julian Anastasov 提交于
- Do not create expectation when forwarding the PORT command to avoid blocking the connection. The problem is that nf_conntrack_ftp.c:help() tries to create the same expectation later in POST_ROUTING and drops the packet with "dropping packet" message after failure in nf_ct_expect_related. - Change ip_vs_update_conntrack to alter the conntrack for related connections from real server. If we do not alter the reply in this direction the next packet from client sent to vport 20 comes as NEW connection. We alter it but may be some collision happens for both conntracks and the second conntrack gets destroyed immediately. The connection stucks too. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 8月, 2010 1 次提交
-
-
由 Simon Horman 提交于
This removes duplicate code by providing a default implementation which is used by 3 of the 4 modules that provide these call. Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 23 7月, 2010 1 次提交
-
-
由 Hannes Eder 提交于
Use nf_conntrack/nf_nat code to do the packet mangling and the TCP sequence adjusting. The function 'ip_vs_skb_replace' is now dead code, so it is removed. To SNAT FTP, use something like: % iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \ --vport 21 -j SNAT --to-source 192.168.10.10 and for the data connections in passive mode: % iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \ --vportctl 21 -j SNAT --to-source 192.168.10.10 using '-m state --state RELATED' would also works. Make sure the kernel modules ip_vs_ftp, nf_conntrack_ftp, and nf_nat_ftp are loaded. [ up-port and minor fixes by Simon Horman <horms@verge.net.au> ] Signed-off-by: NHannes Eder <heder@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 18 2月, 2010 1 次提交
-
-
由 Venkata Mohan Reddy 提交于
Enhance IPVS to load balance SCTP transport protocol packets. This is done based on the SCTP rfc 4960. All possible control chunks have been taken care. The state machine used in this code looks some what lengthy. I tried to make the state machine easy to understand. Signed-off-by: NVenkata Mohan Reddy Koppula <mohanreddykv@gmail.com> Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 05 1月, 2010 1 次提交
-
-
由 Catalin(ux) M. BOIE 提交于
I was very frustrated about the fact that I have to recompile the kernel to change the hash size. So, I created this patch. If IPVS is built-in you can append ip_vs.conn_tab_bits=?? to kernel command line, or, if you built IPVS as modules, you can add options ip_vs conn_tab_bits=??. To keep everything backward compatible, you still can select the size at compile time, and that will be used as default. It has been about a year since this patch was originally posted and subsequently dropped on the basis of insufficient test data. Mark Bergsma has provided the following test results which seem to strongly support the need for larger hash table sizes: We do however run into the same problem with the default setting (212 = 4096 entries), as most of our LVS balancers handle around a million connections/SLAB entries at any point in time (around 100-150 kpps load). With only 4096 hash table entries this implies that each entry consists of a linked list of 256 connections *on average*. To provide some statistics, I did an oprofile run on an 2.6.31 kernel, with both the default 4096 table size, and the same kernel recompiled with IP_VS_CONN_TAB_BITS set to 18 (218 = 262144 entries). I built a quick test setup with a part of Wikimedia/Wikipedia's live traffic mirrored by the switch to the test host. With the default setting, at ~ 120 kpps packet load we saw a typical %si CPU usage of around 30-35%, and oprofile reported a hot spot in ip_vs_conn_in_get: samples % image name app name symbol name 1719761 42.3741 ip_vs.ko ip_vs.ko ip_vs_conn_in_get 302577 7.4554 bnx2 bnx2 /bnx2 181984 4.4840 vmlinux vmlinux __ticket_spin_lock 128636 3.1695 vmlinux vmlinux ip_route_input 74345 1.8318 ip_vs.ko ip_vs.ko ip_vs_conn_out_get 68482 1.6874 vmlinux vmlinux mwait_idle After loading the recompiled kernel with 218 entries, %si CPU usage dropped in half to around 12-18%, and oprofile looks much healthier, with only 7% spent in ip_vs_conn_in_get: samples % image name app name symbol name 265641 14.4616 bnx2 bnx2 /bnx2 143251 7.7986 vmlinux vmlinux __ticket_spin_lock 140661 7.6576 ip_vs.ko ip_vs.ko ip_vs_conn_in_get 94364 5.1372 vmlinux vmlinux mwait_idle 86267 4.6964 vmlinux vmlinux ip_route_input [ horms@verge.net.au: trivial up-port and minor style fixes ] Signed-off-by: NCatalin(ux) M. BOIE <catab@embedromix.ro> Cc: Mark Bergsma <mark@wikimedia.org> Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 04 11月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
This cleanup patch puts struct/union/enum opening braces, in first line to ease grep games. struct something { becomes : struct something { Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 8月, 2009 1 次提交
-
-
由 Jan Engelhardt 提交于
String literals are constant, and usually, we can also tag the array of pointers const too, moving it to the .rodata section. Signed-off-by: NJan Engelhardt <jengelh@medozas.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 8月, 2009 1 次提交
-
-
由 Hannes Eder 提交于
Since pr_err and friends are used instead of printk there is no point in keeping IP_VS_ERR and friends. Furthermore make use of '__func__' instead of hard coded function names. Signed-off-by: NHannes Eder <heder@google.com> Acked-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 7月, 2009 1 次提交
-
-
由 Hannes Eder 提交于
While being at it cleanup whitespace. Signed-off-by: NHannes Eder <heder@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 2月, 2009 1 次提交
-
-
由 Harvey Harrison 提交于
Base versions handle constant folding now. For headers exposed to userspace, we must only expose the __ prefixed versions. Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 11月, 2008 1 次提交
-
-
由 Joe Perches 提交于
The first argument to csum_partial is const void * casts to char/u8 * are not necessary Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2008 1 次提交
-
-
由 Julius Volz 提交于
Remove the 'supports_ipv6' scheduler flag since all schedulers now support IPv6. Signed-off-by: NJulius Volz <julius.volz@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 10月, 2008 1 次提交
-
-
由 Harvey Harrison 提交于
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 10月, 2008 1 次提交
-
-
由 Harvey Harrison 提交于
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 10月, 2008 1 次提交
-
-
由 Harvey Harrison 提交于
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 10月, 2008 1 次提交
-
-
由 Harvey Harrison 提交于
__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 10月, 2008 1 次提交
-
-
由 KOVACS Krisztian 提交于
inet_iif() in inet_sock.h requires route.h. Since users of inet_iif() usually require other route.h functionality anyway this patch moves inet_iif() to route.h. Signed-off-by: NKOVACS Krisztian <hidden@sch.bme.hu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 9月, 2008 2 次提交
-
-
由 Sven Wegener 提交于
Instead of duplicating the fields, integrate a user stats structure into the kernel stats structure. This is more robust when the members are changed, because they are now automatically kept in sync. Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Reviewed-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Sven Wegener 提交于
Instead of checking the value in include/net/ip_vs.h, we can just restrict the range in our Kconfig file. This will prevent values outside of the range early. Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Reviewed-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
- 05 9月, 2008 12 次提交
-
-
由 Julius Volz 提交于
Adjust various debug outputs to use the new *_BUF macro variants for correct output of v4/v6 addresses. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Convert functions for looking up destinations (real servers) to support IPv6 services/dests. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Add xmit functions for IPv6. Also add the already needed __ip_vs_get_out_rt_v6() to ip_vs_core.c. Bind the new xmit functions to v6 connections. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Extend functions for getting/creating connections and connection templates for IPv6 support and fix the callers. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Extend protocol DNAT/SNAT and state handlers to work with IPv6. Also change/introduce new checksumming helper functions for this. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Add 'af' arguments to conn_schedule(), conn_in_get(), conn_out_get() and csum_check() function pointers in struct ip_vs_protocol. Extend the respective functions for TCP, UDP, AH and ESP and adjust the callers. The changes in the callers need to be somewhat extensive, since they now need to pass a filled out struct ip_vs_iphdr * to the modified functions instead of a struct iphdr *. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Add 'supports_ipv6' flag to struct ip_vs_scheduler to indicate whether a scheduler supports IPv6. Set the flag to 1 in schedulers that work with IPv6, 0 otherwise. This flag is checked in a later patch while trying to add a service with a specific scheduler. Adjust debug in v6-supporting schedulers to work with both address families. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Add support for selecting services based on their address family to ip_vs_service_get() and adjust the callers. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Add extended internal versions of struct ip_vs_service_user and struct ip_vs_dest_user (the originals can't be modified as they are part of the old sockopt interface). Adjust ip_vs_ctl.c to work with the new data structures and add some minor AF-awareness. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Add some debugging macros that allow conditional output of either v4 or v6 addresses, depending on an 'af' parameter. This is done by creating a temporary string buffer in an outer debug macro and writing addresses' string representations into it from another macro which can only be used when inside the outer one. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Add a struct ip_vs_iphdr for easier handling of common v4 and v6 header fields in the same code path. ip_vs_fill_iphdr() helps to fill this struct from an IPv4 or IPv6 header. Add further helper functions for copying and comparing addresses. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
由 Julius Volz 提交于
Introduce new 'af' fields into IPVS data structures for specifying an entry's address family. Convert IP addresses to be of type union nf_inet_addr. Signed-off-by: NJulius Volz <juliusv@google.com> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
- 15 8月, 2008 1 次提交
-
-
由 Sven Wegener 提交于
Commit 8ab19ea3 ("ipvs: Fix possible deadlock in estimator code") fixed a deadlock condition, but that condition can only happen during unload of IPVS, because during normal operation there is at least our global stats structure in the estimator list. The mod_timer() and del_timer_sync() calls are actually initialization and cleanup code in disguise. Let's make it explicit and move them to their own init and cleanup function. Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Signed-off-by: NSimon Horman <horms@verge.net.au>
-
- 11 8月, 2008 3 次提交
-
-
由 Sven Wegener 提交于
There's no reason for dynamically allocating an estimator object for every stats object. Directly embed an estimator object into every stats object and switch to using the kernel-provided list implementation. This makes the code much simpler and faster, as we do not need to traverse the list of all estimators to find the one belonging to a stats object. There's no need to use an rwlock, as we only have one reader. Also reorder the members of the estimator structure slightly to avoid padding overhead. This can't be done with the stats object as the members are currently copied to our user space object via memcpy() and changing it would break ABI. Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Acked-by: NSimon Horman <horms@verge.net.au>
-
由 Sven Wegener 提交于
Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Acked-by: NSimon Horman <horms@verge.net.au>
-
由 Sven Wegener 提交于
Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Acked-by: NSimon Horman <horms@verge.net.au>
-
- 01 8月, 2008 1 次提交
-
-
由 Julius Volz 提交于
Current versions of ipvsadm include "/usr/src/linux/include/net/ip_vs.h" directly. This file also contains kernel-only definitions. Normally, public definitions should live in include/linux, so this patch moves the definitions shared with userspace to a new file, "include/linux/ip_vs.h". This also removes the unused NFC_IPVS_PROPERTY bitmask, which was once used to point into skb->nfcache. To make old ipvsadms still compile with this, the old header file includes the new one. Thanks to Dave Miller and Horms for noting/adding the missing Kbuild entry for the new header file. Signed-off-by: NJulius Volz <juliusv@google.com> Acked-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 4月, 2008 1 次提交
-
-
由 Julian Anastasov 提交于
Fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=10556 where conn templates with protocol=IPPROTO_IP can oops backup box. Result from ip_vs_proto_get() should be checked because protocol value can be invalid or unsupported in backup. But for valid message we should not fail for templates which use IPPROTO_IP. Also, add checks to validate message limits and connection state. Show state NONE for templates using IPPROTO_IP. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-