1. 05 Jan, 2010 1 commit
    • C
      IPVS: Allow boot time change of hash size · 6f7edb48
      Catalin(ux) M. BOIE committed
      I was very frustrated about the fact that I have to recompile the kernel
      to change the hash size. So, I created this patch.
      
      If IPVS is built-in you can append ip_vs.conn_tab_bits=?? to kernel
      command line, or, if you built IPVS as modules, you can add
      options ip_vs conn_tab_bits=??.
      
      To keep everything backward compatible, you still can select the size at
      compile time, and that will be used as default.
      
      It has been about a year since this patch was originally posted
      and subsequently dropped on the basis of insufficient test data.
      
      Mark Bergsma has provided the following test results which seem
      to strongly support the need for larger hash table sizes:
      
      We do however run into the same problem with the default setting (2^12 =
      4096 entries), as most of our LVS balancers handle around a million
      connections/SLAB entries at any point in time (around 100-150 kpps
      load). With only 4096 hash table entries this implies that each entry
      consists of a linked list of 256 connections *on average*.
      
      To provide some statistics, I did an oprofile run on an 2.6.31 kernel,
      with both the default 4096 table size, and the same kernel recompiled
      with IP_VS_CONN_TAB_BITS set to 18 (2^18 = 262144 entries). I built a
      quick test setup with a part of Wikimedia/Wikipedia's live traffic
      mirrored by the switch to the test host.
      
      With the default setting, at ~ 120 kpps packet load we saw a typical %si
      CPU usage of around 30-35%, and oprofile reported a hot spot in
      ip_vs_conn_in_get:
      
      samples  %        image name               app name      symbol name
      1719761  42.3741  ip_vs.ko                 ip_vs.ko      ip_vs_conn_in_get
      302577    7.4554  bnx2                     bnx2          /bnx2
      181984    4.4840  vmlinux                  vmlinux       __ticket_spin_lock
      128636    3.1695  vmlinux                  vmlinux       ip_route_input
      74345     1.8318  ip_vs.ko                 ip_vs.ko      ip_vs_conn_out_get
      68482     1.6874  vmlinux                  vmlinux       mwait_idle
      
      After loading the recompiled kernel with 2^18 entries, %si CPU usage
      dropped in half to around 12-18%, and oprofile looks much healthier,
      with only 7% spent in ip_vs_conn_in_get:
      
      samples  %        image name               app name     symbol name
      265641   14.4616  bnx2                     bnx2         /bnx2
      143251    7.7986  vmlinux                  vmlinux      __ticket_spin_lock
      140661    7.6576  ip_vs.ko                 ip_vs.ko     ip_vs_conn_in_get
      94364     5.1372  vmlinux                  vmlinux      mwait_idle
      86267     4.6964  vmlinux                  vmlinux      ip_route_input
      
      [ horms@verge.net.au: trivial up-port and minor style fixes ]
      Signed-off-by: Catalin(ux) M. BOIE <catab@embedromix.ro>
      Cc: Mark Bergsma <mark@wikimedia.org>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
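The chain-length figure quoted in the commit message above can be checked with a little arithmetic. The following is a minimal sketch, not kernel code: the function names are hypothetical, and "around a million" connections is taken to mean 2^20, which is the assumption under which the quoted average of 256 connections per entry works out exactly.

```python
def table_size(conn_tab_bits: int) -> int:
    """Number of hash table entries for a given conn_tab_bits value."""
    return 1 << conn_tab_bits


def average_chain_length(connections: int, conn_tab_bits: int) -> float:
    """Average number of connections per hash entry, assuming an even spread."""
    return connections / table_size(conn_tab_bits)


if __name__ == "__main__":
    # Assumption: "around a million" connections is modelled as 2^20.
    connections = 1 << 20
    for bits in (12, 18):
        print(
            f"conn_tab_bits={bits}: {table_size(bits)} entries, "
            f"about {average_chain_length(connections, bits):.0f} connections per entry"
        )
```

Under the same assumption, moving from 12 to 18 bits shortens the average chain from 256 connections to 4, which is consistent with the drop in time spent in ip_vs_conn_in_get reported by oprofile.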
  2. 04 Nov, 2009 1 commit
  3. 06 Aug, 2009 1 commit
  4. 03 Aug, 2009 1 commit
  5. 31 Jul, 2009 1 commit
  6. 15 Feb, 2009 1 commit
  7. 20 Nov, 2008 1 commit
  8. 04 Nov, 2008 1 commit
  9. 31 Oct, 2008 1 commit
  10. 30 Oct, 2008 1 commit
  11. 29 Oct, 2008 1 commit
  12. 17 Oct, 2008 1 commit
  13. 01 Oct, 2008 1 commit
  14. 09 Sep, 2008 2 commits
  15. 05 Sep, 2008 12 commits
  16. 15 Aug, 2008 1 commit
  17. 11 Aug, 2008 3 commits
  18. 01 Aug, 2008 1 commit
    • J
      ipvs: Move userspace definitions to include/linux/ip_vs.h · bc4768eb
      Julius Volz committed
      Current versions of ipvsadm include "/usr/src/linux/include/net/ip_vs.h"
      directly. This file also contains kernel-only definitions. Normally, public
      definitions should live in include/linux, so this patch moves the
      definitions shared with userspace to a new file, "include/linux/ip_vs.h".
      
      This also removes the unused NFC_IPVS_PROPERTY bitmask, which was once
      used to point into skb->nfcache.
      
      To make old ipvsadms still compile with this, the old header file includes
      the new one.
      
      Thanks to Dave Miller and Horms for noting/adding the missing Kbuild entry
      for the new header file.
      Signed-off-by: Julius Volz <juliusv@google.com>
      Acked-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 29 Apr, 2008 1 commit
  20. 29 Jan, 2008 2 commits
  21. 20 Nov, 2007 3 commits
  22. 07 Nov, 2007 2 commits
    • R
      [IPVS]: Synchronize closing of Connections · efac5276
      Rumen G. Bogdanovski committed
      This patch makes the master daemon sync a connection when it is about
      to close, so that connections on the backup close or time out
      according to their state.  Previously the sync was performed only
      while the connection was in the ESTABLISHED state, which always made
      the connections time out after the hard-coded 3 minutes.  Andy
      Gospodarek's patch ([IPVS]: use proper timeout instead of fixed value)
      effectively did nothing more than increase this to 15 minutes (the
      ESTABLISHED state timeout).  This patch makes use of the proper
      timeout, since it syncs the connections on status changes to FIN_WAIT
      (2 min timeout) and CLOSE (10 sec timeout).  If the backup misses
      CLOSE, hopefully it did not miss FIN_WAIT; otherwise we just have to
      wait for the ESTABLISHED state timeout, as is the case without this
      patch.  This keeps the number of hanging connections on the backup to
      a minimum, and very few of them are left to expire with a long timeout.
      
      This is important if we want to make use of the fix for the real server
      overcommit on master/backup fail-over.
      Signed-off-by: Rumen G. Bogdanovski <rumen@voicecho.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
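The effect of syncing on state changes, as described in the commit message above, can be pictured with a small model. This is a deliberately simplified sketch and not the kernel's sync code: the function name and the list-of-states representation are hypothetical, and only the three timeouts quoted in the message (15 minutes, 2 minutes, 10 seconds) are taken from the source.

```python
# Timeouts quoted in the commit message, in seconds.
STATE_TIMEOUT_SECONDS = {
    "ESTABLISHED": 15 * 60,
    "FIN_WAIT": 2 * 60,
    "CLOSE": 10,
}


def backup_expiry_seconds(synced_states):
    """Timeout that applies on the backup, given the sync messages it received.

    Hypothetical model: the backup keeps the timeout of the last state
    it was told about.
    """
    if not synced_states:
        raise ValueError("the backup never learned about this connection")
    return STATE_TIMEOUT_SECONDS[synced_states[-1]]


if __name__ == "__main__":
    scenarios = {
        "all syncs received": ["ESTABLISHED", "FIN_WAIT", "CLOSE"],
        "CLOSE missed": ["ESTABLISHED", "FIN_WAIT"],
        "FIN_WAIT and CLOSE missed": ["ESTABLISHED"],
    }
    for label, states in scenarios.items():
        print(f"{label}: expires after {backup_expiry_seconds(states)} s")
```

In this model only a backup that misses both FIN_WAIT and CLOSE falls back to the long ESTABLISHED timeout, which matches the behaviour the message describes for the unpatched case.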
    • R
      [IPVS]: Bind connections on stanby if the destination exists · 1e356f9c
      Rumen G. Bogdanovski committed
      This patch fixes the problem with node overload on director fail-over.
      Consider a scenario with 2 directors and 2 nodes, each node accepting
      3 connections at a time. If director failover occurs when the nodes
      are fully loaded (6 connections to the cluster), the new director will
      assign another 6 connections to the cluster, if the same real servers
      exist there.
      
      The problem turned out to be that the inherited connections were not
      bound to the real servers (destinations) on the backup director. As a
      result, "ipvsadm -l" reports 0 connections:
      root@test2:~# ipvsadm -l
      IP Virtual Server version 1.2.1 (size=4096)
      Prot LocalAddress:Port Scheduler Flags
        -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
      TCP  test2.local:5999 wlc
        -> node473.local:5999           Route   1000   0          0
        -> node484.local:5999           Route   1000   0          0
      
      while "ipvsadm -lnc" is right:
      root@test2:~# ipvsadm -lnc
      IPVS connection entries
      pro expire state       source             virtual            destination
      TCP 14:56  ESTABLISHED 192.168.0.10:39164 192.168.0.222:5999 192.168.0.51:5999
      TCP 14:59  ESTABLISHED 192.168.0.10:39165 192.168.0.222:5999 192.168.0.52:5999
      
      So the patch I am sending fixes the problem by binding the received
      connections to the appropriate service on the backup director, if it
      exists, else the connection will be handled the old way. So if the
      master and the backup directors are synchronized in terms of real
      services there will be no problem with server over-committing since
      new connections will not be created on the nonexistent real services
      on the backup. However if the service is created later on the backup,
      the binding will be performed when the next connection update is
      received. With this patch the inherited connections will show as
      inactive on the backup:
      
      root@test2:~# ipvsadm -l
      IP Virtual Server version 1.2.1 (size=4096)
      Prot LocalAddress:Port Scheduler Flags
        -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
      TCP  test2.local:5999 wlc
        -> node473.local:5999           Route   1000   0          1
        -> node484.local:5999           Route   1000   0          1
      
      rumen@test2:~$ cat /proc/net/ip_vs
      IP Virtual Server version 1.2.1 (size=4096)
      Prot LocalAddress:Port Scheduler Flags
        -> RemoteAddress:Port Forward Weight ActiveConn InActConn
      TCP  C0A800DE:176F wlc
        -> C0A80033:176F      Route   1000   0          1
        -> C0A80032:176F      Route   1000   0          1
      
      Regards,
      Rumen Bogdanovski
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Rumen G. Bogdanovski <rumen@voicecho.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
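The /proc/net/ip_vs listing in the commit message above prints addresses and ports in hexadecimal, which makes it harder to compare with the ipvsadm output. The following is a minimal sketch of a decoder, assuming IPv4 entries written as eight hex digits for the address and four for the port, as in the listing; the function name is hypothetical.

```python
import ipaddress
from typing import Tuple


def decode_proc_endpoint(field: str) -> Tuple[str, int]:
    """Convert an 'ADDRHEX:PORTHEX' field into (dotted-quad address, port).

    Assumes an IPv4 entry whose hex digits read most significant byte first.
    """
    addr_hex, port_hex = field.split(":")
    address = str(ipaddress.IPv4Address(int(addr_hex, 16)))
    return address, int(port_hex, 16)


if __name__ == "__main__":
    for field in ("C0A800DE:176F", "C0A80033:176F", "C0A80032:176F"):
        address, port = decode_proc_endpoint(field)
        print(f"{field} -> {address}:{port}")
```

Decoded this way, the virtual service C0A800DE:176F is 192.168.0.222:5999, the same virtual address shown by "ipvsadm -lnc".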