1. 28 10月, 2010 3 次提交
  2. 27 10月, 2010 3 次提交
  3. 26 10月, 2010 3 次提交
  4. 25 10月, 2010 2 次提交
  5. 21 10月, 2010 7 次提交
  6. 18 10月, 2010 1 次提交
  7. 17 10月, 2010 2 次提交
  8. 14 10月, 2010 3 次提交
  9. 12 10月, 2010 1 次提交
    • E
      net dst: use a percpu_counter to track entries · fc66f95c
      Eric Dumazet 提交于
      struct dst_ops tracks number of allocated dst in an atomic_t field,
      subject to high cache line contention in stress workload.
      
      Switch to a percpu_counter, to reduce number of time we need to dirty a
      central location. Place it on a separate cache line to avoid dirtying
      read only fields.
      
      Stress test :
      
      (Sending 160.000.000 UDP frames,
      IP route cache disabled, dual E5540 @2.53GHz,
      32bit kernel, FIB_TRIE, SLUB/NUMA)
      
      Before:
      
      real    0m51.179s
      user    0m15.329s
      sys     10m15.942s
      
      After:
      
      real	0m45.570s
      user	0m15.525s
      sys	9m56.669s
      
      With a small reordering of struct neighbour fields, subject of a
      following patch, (to separate refcnt from other read mostly fields)
      
      real	0m41.841s
      user	0m15.261s
      sys	8m45.949s
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc66f95c
  10. 06 10月, 2010 2 次提交
    • E
      net neigh: RCU conversion of neigh hash table · d6bf7817
      Eric Dumazet 提交于
      David
      
      This is the first step for RCU conversion of neigh code.
      
      Next patches will convert hash_buckets[] and "struct neighbour" to RCU
      protected objects.
      
      Thanks
      
      [PATCH net-next] net neigh: RCU conversion of neigh hash table
      
      Instead of storing hash_buckets, hash_mask and hash_rnd in "struct
      neigh_table", a new structure is defined :
      
      struct neigh_hash_table {
             struct neighbour        **hash_buckets;
             unsigned int            hash_mask;
             __u32                   hash_rnd;
             struct rcu_head         rcu;
      };
      
      And "struct neigh_table" has an RCU protected pointer to such a
      neigh_hash_table.
      
      This means the signature of (*hash)() function changed: We need to add a
      third parameter with the actual hash_rnd value, since this is not
      anymore a neigh_table field.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6bf7817
    • E
      net: add a core netdev->rx_dropped counter · caf586e5
      Eric Dumazet 提交于
      In various situations, a device provides a packet to our stack and we
      drop it before it enters protocol stack :
      - softnet backlog full (accounted in /proc/net/softnet_stat)
      - bad vlan tag (not accounted)
      - unknown/unregistered protocol (not accounted)
      
      We can handle a per-device counter of such dropped frames at core level,
      and automatically adds it to the device provided stats (rx_dropped), so
      that standard tools can be used (ifconfig, ip link, cat /proc/net/dev)
      
      This is a generalization of commit 8990f468 (net: rx_dropped
      accounting), thus reverting it.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      caf586e5
  11. 05 10月, 2010 2 次提交
  12. 04 10月, 2010 1 次提交
  13. 30 9月, 2010 3 次提交
    • E
      ip6tnl: percpu stats accounting · 8560f226
      Eric Dumazet 提交于
      Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets.
      
      Other seldom used fields are kept in netdev->stats structure, possibly
      unsafe.
      
      This is a preliminary work to support lockless transmit path, and
      correct RX stats, that are already unsafe.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8560f226
    • E
      sit: enable lockless xmits · 8df40d10
      Eric Dumazet 提交于
      SIT tunnels can benefit from lockless xmits, using NETIF_F_LLTX
      
      Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending
      10000000 UDP frames via one sit tunnel (size:220 bytes per frame)
      
      Before patch :
      
      real	3m15.399s
      user	0m9.185s
      sys	51m55.403s
      
      75029.00 87.5% _raw_spin_lock            vmlinux
       1090.00  1.3% dst_release               vmlinux
        902.00  1.1% dev_queue_xmit            vmlinux
        627.00  0.7% sock_wfree                vmlinux
        613.00  0.7% ip6_push_pending_frames   ipv6.ko
        505.00  0.6% __ip_route_output_key     vmlinux
      
      After patch:
      
      real	1m1.387s
      user	0m12.489s
      sys	15m58.868s
      
      28239.00 23.3% dst_release               vmlinux
      13570.00 11.2% ip6_push_pending_frames   ipv6.ko
      13118.00 10.8% ip6_append_data           ipv6.ko
       7995.00  6.6% __ip_route_output_key     vmlinux
       7924.00  6.5% sk_dst_check              vmlinux
       5015.00  4.1% udpv6_sendmsg             ipv6.ko
       3594.00  3.0% sock_alloc_send_pskb      vmlinux
       3135.00  2.6% sock_wfree                vmlinux
       3055.00  2.5% ip6_sk_dst_lookup         ipv6.ko
       2473.00  2.0% ip_finish_output          vmlinux
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8df40d10
    • E
      sit: fix percpu stats accounting · dd4080ee
      Eric Dumazet 提交于
      commit 15fc1f70 (sit: percpu stats accounting) forgot the fallback
      tunnel case (sit0), and can crash pretty fast.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dd4080ee
  14. 29 9月, 2010 1 次提交
    • M
      ipv6: Implement Any-IP support for IPv6. · ab79ad14
      Maciej Żenczykowski 提交于
      AnyIP is the capability to receive packets and establish incoming
      connections on IPs we have not explicitly configured on the machine.
      
      An example use case is to configure a machine to accept all incoming
      traffic on eth0, and leave the policy of whether traffic for a given IP
      should be delivered to the machine up to the load balancer.
      
      Can be setup as follows:
        ip -6 rule from all iif eth0 lookup 200
        ip -6 route add local default dev lo table 200
      (in this case for all IPv6 addresses)
      Signed-off-by: NMaciej Żenczykowski <maze@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ab79ad14
  15. 28 9月, 2010 2 次提交
  16. 27 9月, 2010 1 次提交
    • N
      ipv6: add a missing unregister_pernet_subsys call · 2cc6d2bf
      Neil Horman 提交于
      Clean up a missing exit path in the ipv6 module init routines.  In
      addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys
      for the ipv6_addr_label_ops structure.  But if module loading fails, or if the
      ipv6 module is removed, there is no corresponding unregister_pernet_subsys call,
      which leaves a now-bogus address on the pernet_list, leading to oopses in
      subsequent registrations.  This patch cleans up both the failed load path and
      the unload path.  Tested by myself with good results.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      
       include/net/addrconf.h |    1 +
       net/ipv6/addrconf.c    |   11 ++++++++---
       net/ipv6/addrlabel.c   |    5 +++++
       3 files changed, 14 insertions(+), 3 deletions(-)
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2cc6d2bf
  17. 24 9月, 2010 1 次提交
  18. 22 9月, 2010 1 次提交
    • E
      ip: fix truesize mismatch in ip fragmentation · 3d13008e
      Eric Dumazet 提交于
      Special care should be taken when slow path is hit in ip_fragment() :
      
      When walking through frags, we transfert truesize ownership from skb to
      frags. Then if we hit a slow_path condition, we must undo this or risk
      uncharging frags->truesize twice, and in the end, having negative socket
      sk_wmem_alloc counter, or even freeing socket sooner than expected.
      
      Many thanks to Nick Bowler, who provided a very clean bug report and
      test program.
      
      Thanks to Jarek for reviewing my first patch and providing a V2
      
      While Nick bisection pointed to commit 2b85a34e (net: No more
      expensive sock_hold()/sock_put() on each tx), underlying bug is older
      (2.6.12-rc5)
      
      A side effect is to extend work done in commit b2722b1c
      (ip_fragment: also adjust skb->truesize for packets not owned by a
      socket) to ipv6 as well.
      Reported-and-bisected-by: NNick Bowler <nbowler@elliptictech.com>
      Tested-by: NNick Bowler <nbowler@elliptictech.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Jarek Poplawski <jarkao2@gmail.com>
      CC: Patrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3d13008e
  19. 21 9月, 2010 1 次提交
    • T
      xfrm: Allow different selector family in temporary state · 8444cf71
      Thomas Egerer 提交于
      The family parameter xfrm_state_find is used to find a state matching a
      certain policy. This value is set to the template's family
      (encap_family) right before xfrm_state_find is called.
      The family parameter is however also used to construct a temporary state
      in xfrm_state_find itself which is wrong for inter-family scenarios
      because it produces a selector for the wrong family. Since this selector
      is included in the xfrm_user_acquire structure, user space programs
      misinterpret IPv6 addresses as IPv4 and vice versa.
      This patch splits up the original init_tempsel function into a part that
      initializes the selector respectively the props and id of the temporary
      state, to allow for differing ip address families whithin the state.
      Signed-off-by: NThomas Egerer <thomas.egerer@secunet.com>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8444cf71