1. 09 Jun, 2011 (2 commits)
    • inetpeer: lower false sharing effect · 2b77bdde
      Authored by Eric Dumazet
      Profiles show false sharing in addr_compare(): refcnt/dtime changes
      dirty the first inet_peer cache line, where the keys used at lookup
      time reside. When many cpus call inet_getpeer() and inet_putpeer(),
      or need frag ids, addr_compare() climbs to 2nd position in
      "perf top".
      
      Before the patch, my udpflood bench (16 threads) on my 2x4x2 machine:
      
                   5784.00  9.7% csum_partial_copy_generic [kernel]
                   3356.00  5.6% addr_compare              [kernel]
                   2638.00  4.4% fib_table_lookup          [kernel]
                   2625.00  4.4% ip_fragment               [kernel]
                   1934.00  3.2% neigh_lookup              [kernel]
                   1617.00  2.7% udp_sendmsg               [kernel]
                   1608.00  2.7% __ip_route_output_key     [kernel]
                   1480.00  2.5% __ip_append_data          [kernel]
                   1396.00  2.3% kfree                     [kernel]
                   1195.00  2.0% kmem_cache_free           [kernel]
                   1157.00  1.9% inet_getpeer              [kernel]
                   1121.00  1.9% neigh_resolve_output      [kernel]
                   1012.00  1.7% dev_queue_xmit            [kernel]
      # time ./udpflood.sh
      
      real	0m44.511s
      user	0m20.020s
      sys	11m22.780s
      
      # time ./udpflood.sh
      
      real	0m44.099s
      user	0m20.140s
      sys	11m15.870s
      
      After the patch, addr_compare() no longer shows up in profiles:
      
                   4171.00 10.7% csum_partial_copy_generic   [kernel]
                   1787.00  4.6% fib_table_lookup            [kernel]
                   1756.00  4.5% ip_fragment                 [kernel]
                   1234.00  3.2% udp_sendmsg                 [kernel]
                   1191.00  3.0% neigh_lookup                [kernel]
                   1118.00  2.9% __ip_append_data            [kernel]
                   1022.00  2.6% kfree                       [kernel]
                    993.00  2.5% __ip_route_output_key       [kernel]
                    841.00  2.2% neigh_resolve_output        [kernel]
                    816.00  2.1% kmem_cache_free             [kernel]
                    658.00  1.7% ia32_sysenter_target        [kernel]
                    632.00  1.6% kmem_cache_alloc_node       [kernel]
      
      # time ./udpflood.sh
      
      real	0m41.587s
      user	0m19.190s
      sys	10m36.370s
      
      # time ./udpflood.sh
      
      real	0m41.486s
      user	0m19.290s
      sys	10m33.650s
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inetpeer: remove unused list · 4b9d9be8
      Authored by Eric Dumazet
      Andi Kleen and Tim Chen reported huge contention on the inetpeer
      unused_peers.lock on a memcached workload on a 40-core machine with
      the route cache disabled.

      It appears we constantly flip peer refcnts between 0 and 1, and we
      must insert/remove peers from unused_peers.list while holding a
      contended spinlock.
      
      Remove this list completely and perform garbage collection
      on-the-fly, at lookup time, using the expired nodes met during the
      tree traversal, as in the sketch below.
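
      A hedged user-space sketch of that idea (stand-in types, not the
      kernel code): the lookup walk records nodes whose refcnt is zero
      and whose last use is older than the TTL, and the caller reaps
      them once the walk is done, so no separate unused list or global
      lock is needed.

      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stdint.h>
      #include <time.h>

      struct peer {
              struct peer *avl_left, *avl_right;
              atomic_int   refcnt;
              time_t       dtime;           /* time of last put */
              uint32_t     v4daddr;
      };

      /* An expired node is one nobody references whose last use is
       * older than the configured TTL. */
      static bool peer_expired(const struct peer *p, time_t now, time_t ttl)
      {
              return atomic_load(&p->refcnt) == 0 && now - p->dtime > ttl;
      }

      /* While looking up 'key', collect up to 'max' expired nodes met
       * on the path; the caller unlinks and frees them afterwards. */
      static int lookup_collecting_expired(struct peer *n, uint32_t key,
                                           time_t ttl, struct peer **gc,
                                           int max)
      {
              time_t now = time(NULL);
              int cnt = 0;

              while (n) {
                      if (cnt < max && peer_expired(n, now, ttl))
                              gc[cnt++] = n;
                      if (key == n->v4daddr)
                              break;
                      n = key < n->v4daddr ? n->avl_left : n->avl_right;
              }
              return cnt;
      }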
      
      This removes a lot of code, makes locking more standard, and obsoletes
      two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
      removes two pointers in inet_peer structure.
      
      There is still a false sharing effect because refcnt is in the
      first cache line of the object [where the links and keys used by
      lookups are located]; we might move it to the end of the inet_peer
      structure to keep this first cache line mostly read by cpus.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Andi Kleen <andi@firstfloor.org>
      CC: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 23 Apr, 2011 (1 commit)
  3. 11 Feb, 2011 (2 commits)
  4. 05 Feb, 2011 (1 commit)
  5. 28 Jan, 2011 (2 commits)
  6. 02 Dec, 2010 (2 commits)
  7. 01 Dec, 2010 (3 commits)
  8. 28 Oct, 2010 (1 commit)
  9. 17 Jun, 2010 (1 commit)
    • inetpeer: restore small inet_peer structures · 317fe0e6
      Authored by Eric Dumazet
      The addition of rcu_head to struct inet_peer added 16 bytes on
      64bit arches.

      That's a bit unfortunate, since the old size was exactly 64 bytes.
      
      This can be solved using a union between this rcu_head and four
      fields that are normally used only when a refcount is taken on
      inet_peer. rcu_head is used only when refcnt=-1, right before the
      structure is freed.
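
      A hypothetical sketch of that union (struct rcu_head is replaced by
      a stand-in so this compiles outside the kernel; field types are
      illustrative): the RCU callback storage overlaps fields that are
      only meaningful while a reference is held.

      #include <stdint.h>

      struct rcu_head_like { void *next; void (*func)(void *); };

      struct peer {
              void    *avl_left, *avl_right;
              uint32_t v4daddr;
              int      refcnt;                  /* -1 means queued for RCU free */
              union {
                      struct {                  /* valid only while refcnt >= 0 */
                              uint32_t rid;
                              uint32_t ip_id_count;
                              uint32_t tcp_ts;
                              uint32_t tcp_ts_stamp;
                      } live;
                      struct rcu_head_like rcu; /* used only when refcnt == -1  */
              } u;
      };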
      
      Add an inet_peer_refcheck() function to check this assertion for a while.
      
      We can bring back the SLAB_HWCACHE_ALIGN qualifier in kmem cache creation.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 16 Jun, 2010 (1 commit)
    • inetpeer: RCU conversion · aa1039e7
      Authored by Eric Dumazet
      inetpeer currently uses an AVL tree protected by an rwlock.
      
      It's possible to make most lookups use RCU:
      
      1) Add a struct rcu_head to struct inet_peer
      
      2) Add a lookup_rcu_bh() helper to perform a lockless, opportunistic
      lookup. This is a normal function, not a macro like lookup().
      
      3) Add a limit to the number of links followed by lookup_rcu_bh(),
      needed in case we fall into a loop (see the sketch after this list).
      
      4) Add an smp_wmb() in link_to_pool() right before node insert.
      
      5) Make unlink_from_pool() use atomic_cmpxchg() to make sure it can
      take the last reference to an inet_peer, since lockless readers could
      increase the refcount even while we hold peers.lock.
      
      6) Delay struct inet_peer freeing until after an RCU grace period so
      that lookup_rcu_bh() cannot crash.
      
      7) inet_getpeer() first attempts a lockless lookup.
         Note this lookup can fail even if the target is in the AVL tree,
      because a concurrent writer can leave the tree in a transiently
      inconsistent form.
         If this attempt fails, the lock is taken and a regular lookup is
      performed again.
      
      8) Convert peers.lock from an rwlock to a spinlock.
      
      9) Remove SLAB_HWCACHE_ALIGN when peer_cachep is created, because
      rcu_head adds 16 bytes on 64bit arches, doubling the effective size
      (64 -> 128 bytes).
      In a future patch it should be possible to revert this part, by
      putting the rcu field in a union sharing space with rid, ip_id_count,
      tcp_ts & tcp_ts_stamp, since these fields are manipulated only with
      refcnt > 0.
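
      A user-space sketch of steps 3, 5 and 7 (stand-in types and names,
      not the kernel code): the lockless walk bounds the number of links
      it follows and the caller retries under the lock on a miss, while
      the last reference is claimed with a compare-and-swap so a racing
      lockless reader cannot be lost.

      #include <stdatomic.h>
      #include <stddef.h>

      struct node {
              _Atomic(struct node *) avl_left, avl_right;
              unsigned key;
              atomic_int refcnt;
      };

      /* Steps 3 and 7: bounded lockless lookup. A concurrent writer may
       * rebalance the tree under us, so give up after max_links pointer
       * chases and let the caller retry with the lock held. */
      static struct node *lookup_rcu_like(_Atomic(struct node *) *rootp,
                                          unsigned key, int max_links)
      {
              struct node *n = atomic_load(rootp);

              while (n && max_links-- > 0) {
                      if (n->key == key)
                              return n;
                      n = atomic_load(key < n->key ? &n->avl_left
                                                   : &n->avl_right);
              }
              return NULL;    /* miss or suspected loop: take the lock */
      }

      /* Step 5: take the last reference only if no lockless reader bumped
       * the count meanwhile; on failure the node is still in use. */
      static int take_last_ref(struct node *n)
      {
              int one = 1;
              return atomic_compare_exchange_strong(&n->refcnt, &one, 0);
      }
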
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 14 Nov, 2009 (1 commit)
    • inetpeer: Optimize inet_getid() · 2c1409a0
      Authored by Eric Dumazet
      While investigating network latencies, I found inet_getid() was a
      contention point for some workloads, as inet_peer_idlock is shared
      by all inet_getid() users regardless of which peer is involved.
      
      One way to fix this is to make ip_id_count an atomic_t instead of a
      __u16, and use atomic_add_return(), as sketched below.
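
      A minimal sketch in user-space C (atomic_fetch_add standing in for
      the kernel's atomic_add_return; names are illustrative): one atomic
      read-modify-write per call replaces the shared inet_peer_idlock.

      #include <stdatomic.h>

      /* Return the next IP id for this peer, reserving 'more + 1' ids in
       * one atomic step; the pre-add value is the first id handed out. */
      static unsigned short getid_like(atomic_int *ip_id_count, int more)
      {
              return (unsigned short)atomic_fetch_add(ip_id_count, more + 1);
      }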
      
      To keep sizeof(struct inet_peer) = 64 on 64bit arches, tcp_ts_stamp
      is also converted to a __u32 instead of an "unsigned long".
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 04 Nov, 2009 (1 commit)
  13. 12 Jun, 2008 (1 commit)
  14. 13 Nov, 2007 (1 commit)
  15. 20 Oct, 2006 (1 commit)
  16. 16 Oct, 2006 (1 commit)
  17. 29 Sep, 2006 (1 commit)
  18. 04 Jan, 2006 (1 commit)
  19. 17 Apr, 2005 (1 commit)
    • Linux-2.6.12-rc2 · 1da177e4
      Authored by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!