1. 10 1月, 2014 14 次提交
  2. 08 1月, 2014 12 次提交
    • P
      netfilter: nft_ct: load both IPv4 and IPv6 conntrack modules for NFPROTO_INET · 9638f33e
      Patrick McHardy 提交于
      The ct expression can currently not be used in the inet family since
      we don't have a conntrack module for NFPROTO_INET, so
      nf_ct_l3proto_try_module_get() fails. Add some manual handling to
      load the modules for both NFPROTO_IPV4 and NFPROTO_IPV6 if the
      ct expression is used in the inet family.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      9638f33e
    • P
      netfilter: nft_meta: add l4proto support · 4566bf27
      Patrick McHardy 提交于
      For L3-proto independant rules we need to get at the L4 protocol value
      directly. Add it to the nft_pktinfo struct and use the meta expression
      to retrieve it.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      4566bf27
    • P
      netfilter: nf_tables: add nfproto support to meta expression · 124edfa9
      Patrick McHardy 提交于
      Needed by multi-family tables to distinguish IPv4 and IPv6 packets.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      124edfa9
    • P
      netfilter: nf_tables: add "inet" table for IPv4/IPv6 · 1d49144c
      Patrick McHardy 提交于
      This patch adds a new table family and a new filter chain that you can
      use to attach IPv4 and IPv6 rules. This should help to simplify
      rule-set maintainance in dual-stack setups.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      1d49144c
    • P
      netfilter: nf_tables: add support for multi family tables · 115a60b1
      Patrick McHardy 提交于
      Add support to register chains to multiple hooks for different address
      families for mixed IPv4/IPv6 tables.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      115a60b1
    • P
      netfilter: nf_tables: add hook ops to struct nft_pktinfo · c9484874
      Patrick McHardy 提交于
      Multi-family tables need the AF from the hook ops. Add a pointer to the
      hook ops and replace usage of the hooknum member in struct nft_pktinfo.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      c9484874
    • P
      netfilter: nf_tables: make chain types override the default AF functions · 3b088c4b
      Patrick McHardy 提交于
      Currently the AF-specific hook functions override the chain-type specific
      hook functions. That doesn't make too much sense since the chain types
      are a special case of the AF-specific hooks.
      
      Make the AF-specific hook functions the default and make the optional
      chain type hooks override them.
      
      As a side effect, the necessary code restructuring reduces the code size,
      f.i. in case of nf_tables_ipv4.o:
      
        nf_tables_ipv4_init_net   |  -24
        nft_do_chain_ipv4         | -113
       2 functions changed, 137 bytes removed, diff: -137
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      3b088c4b
    • P
      netfilter: nft_reject: fix compilation warning if NF_TABLES_IPV6 is disabled · 688d1863
      Pablo Neira Ayuso 提交于
      net/netfilter/nft_reject.c: In function 'nft_reject_eval':
      net/netfilter/nft_reject.c:37:14: warning: unused variable 'net' [-Wunused-variable]
      Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      688d1863
    • B
      net: Do not enable tx-nocache-copy by default · cdb3f4a3
      Benjamin Poirier 提交于
      There are many cases where this feature does not improve performance or even
      reduces it.
      
      For example, here are the results from tests that I've run using 3.12.6 on one
      Intel Xeon W3565 and one i7 920 connected by ixgbe adapters. The results are
      from the Xeon, but they're similar on the i7. All numbers report the
      mean±stddev over 10 runs of 10s.
      
      1) latency tests similar to what is described in "c6e1a0d1 net: Allow no-cache
      copy from user on transmit"
      There is no statistically significant difference between tx-nocache-copy
      on/off.
      nic irqs spread out (one queue per cpu)
      
      200x netperf -r 1400,1
      tx-nocache-copy off
              692000±1000 tps
              50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
      tx-nocache-copy on
              693000±1000 tps
              50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7
      
      200x netperf -r 14000,14000
      tx-nocache-copy off
              86450±80 tps
              50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
      tx-nocache-copy on
              86110±60 tps
              50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20
      
      2) single stream throughput tests
      tx-nocache-copy leads to higher service demand
      
                              throughput  cpu0        cpu1        demand
                              (Gb/s)      (Gcycle)    (Gcycle)    (cycle/B)
      
      nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)
      
      tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
      tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004
      
      nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)
      
      tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
      tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002
      
      As a second example, here are some results from Eric Dumazet with latest
      net-next.
      tx-nocache-copy also leads to higher service demand
      
      (cpu is Intel(R) Xeon(R) CPU X5660  @ 2.80GHz)
      
      lpq83:~# ./ethtool -K eth0 tx-nocache-copy on
      lpq83:~# perf stat ./netperf -H lpq84 -c
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB
      
       87380  16384  16384    10.00      9407.44   2.50     -1.00    0.522   -1.000
      
       Performance counter stats for './netperf -H lpq84 -c':
      
             4282.648396 task-clock                #    0.423 CPUs utilized
                   9,348 context-switches          #    0.002 M/sec
                      88 CPU-migrations            #    0.021 K/sec
                     355 page-faults               #    0.083 K/sec
          11,812,797,651 cycles                    #    2.758 GHz                     [82.79%]
           9,020,522,817 stalled-cycles-frontend   #   76.36% frontend cycles idle    [82.54%]
           4,579,889,681 stalled-cycles-backend    #   38.77% backend  cycles idle    [67.33%]
           6,053,172,792 instructions              #    0.51  insns per cycle
                                                   #    1.49  stalled cycles per insn [83.64%]
             597,275,583 branches                  #  139.464 M/sec                   [83.70%]
               8,960,541 branch-misses             #    1.50% of all branches         [83.65%]
      
            10.128990264 seconds time elapsed
      
      lpq83:~# ./ethtool -K eth0 tx-nocache-copy off
      lpq83:~# perf stat ./netperf -H lpq84 -c
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB
      
       87380  16384  16384    10.00      9412.45   2.15     -1.00    0.449   -1.000
      
       Performance counter stats for './netperf -H lpq84 -c':
      
             2847.375441 task-clock                #    0.281 CPUs utilized
                  11,632 context-switches          #    0.004 M/sec
                      49 CPU-migrations            #    0.017 K/sec
                     354 page-faults               #    0.124 K/sec
           7,646,889,749 cycles                    #    2.686 GHz                     [83.34%]
           6,115,050,032 stalled-cycles-frontend   #   79.97% frontend cycles idle    [83.31%]
           1,726,460,071 stalled-cycles-backend    #   22.58% backend  cycles idle    [66.55%]
           2,079,702,453 instructions              #    0.27  insns per cycle
                                                   #    2.94  stalled cycles per insn [83.22%]
             363,773,213 branches                  #  127.757 M/sec                   [83.29%]
               4,242,732 branch-misses             #    1.17% of all branches         [83.51%]
      
            10.128449949 seconds time elapsed
      
      CC: Tom Herbert <therbert@google.com>
      Signed-off-by: NBenjamin Poirier <bpoirier@suse.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cdb3f4a3
    • J
      ipv4: loopback device: ignore value changes after device is upped · dfd1582d
      Jiri Pirko 提交于
      When lo is brought up, new ifa is created. Then, devconf and neigh values
      bitfield should be set so later changes of default values would not
      affect lo values.
      
      Note that the same behaviour is in ipv6. Also note that this is likely
      not an issue in many distros (for example Fedora 19) because userspace
      sets address to lo manually before bringing it up.
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfd1582d
    • F
      IPv6: add the option to use anycast addresses as source addresses in echo reply · 509aba3b
      FX Le Bail 提交于
      This change allows to follow a recommandation of RFC4942.
      
      - Add "anycast_src_echo_reply" sysctl to control the use of anycast addresses
        as source addresses for ICMPv6 echo reply. This sysctl is false by default
        to preserve existing behavior.
      - Add inline check ipv6_anycast_destination().
      - Use them in icmpv6_echo_reply().
      
      Reference:
      RFC4942 - IPv6 Transition/Coexistence Security Considerations
         (http://tools.ietf.org/html/rfc4942#section-2.1.6)
      
      2.1.6. Anycast Traffic Identification and Security
      
         [...]
         To avoid exposing knowledge about the internal structure of the
         network, it is recommended that anycast servers now take advantage of
         the ability to return responses with the anycast address as the
         source address if possible.
      Signed-off-by: NFrancois-Xavier Le Bail <fx.lebail@yahoo.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      509aba3b
    • W
      net/mlx4_en: fix error return code in mlx4_en_get_qp() · 9ba75fb0
      Wei Yongjun 提交于
      Fix to return a negative error code from the error handling
      case instead of 0.
      
      Fixes: 837052d0 ('net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling')
      Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9ba75fb0
  3. 07 1月, 2014 14 次提交