1. 10 Jun 2012, 2 commits
  2. 09 Jun 2012, 1 commit
  3. 07 Jun 2012, 1 commit
  4. 08 Mar 2012, 2 commits
  5. 18 Jan 2012, 1 commit
  6. 17 Jan 2012, 1 commit
  7. 07 Aug 2011, 1 commit
    • net: Compute protocol sequence numbers and fragment IDs using MD5. · 6e5714ea
      David S. Miller authored
      Computers have become a lot faster since we compromised on the
      partial MD4 hash which we currently use for performance reasons.
      
      MD5 is a much safer choice, and is in line with both RFC 1948 and
      other ISS generators (OpenBSD, Solaris, etc.).
      
      Furthermore, having only 24 bits of the sequence number be truly
      unpredictable is a very serious limitation.  So the periodic
      regeneration and 8-bit counter have been removed.  We compute and
      use a full 32-bit sequence number (see the sketch after this
      commit).
      
      For IPv6, DCCP was found to use a 32-bit truncated initial sequence
      number (it needs 43 bits), and that is fixed here as well.
      Reported-by: Dan Kaminsky <dan@doxpara.com>
      Tested-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
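      A rough user-space illustration of the scheme described above; this is
      a hedged sketch, not the kernel's code, and the buffer layout, the
      boot_secret variable and the clock shift are assumptions made for the
      example:

          /* RFC 1948-style ISN sketch: hash the 4-tuple plus a secret,
           * use all 32 bits of the result, and add a clock term so the
           * ISNs of successive connections keep advancing. */
          #include <openssl/md5.h>   /* build with -lcrypto */
          #include <stdint.h>
          #include <string.h>
          #include <time.h>

          static uint8_t boot_secret[16];  /* filled with random bytes at boot */

          uint32_t isn_sketch(uint32_t saddr, uint32_t daddr,
                              uint16_t sport, uint16_t dport)
          {
              uint8_t buf[12 + sizeof(boot_secret)];
              uint8_t digest[MD5_DIGEST_LENGTH];
              uint32_t hash;

              memcpy(buf,      &saddr, 4);
              memcpy(buf + 4,  &daddr, 4);
              memcpy(buf + 8,  &sport, 2);
              memcpy(buf + 10, &dport, 2);
              memcpy(buf + 12, boot_secret, sizeof(boot_secret));

              MD5(buf, sizeof(buf), digest);
              memcpy(&hash, digest, 4);    /* full 32 unpredictable bits */

              return hash + ((uint32_t)time(NULL) << 6);
          }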
  8. 22 Jul 2011, 1 commit
  9. 12 Jul 2011, 1 commit
    • inetpeer: kill inet_putpeer race · 6d1a3e04
      Eric Dumazet authored
      We can currently free inetpeer entries too early:
      
      [  782.636674] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f130f44c)
      [  782.636677] 1f7b13c100000000000000000000000002000000000000000000000000000000
      [  782.636686]  i i i i u u u u i i i i u u u u i i i i u u u u u u u u u u u u
      [  782.636694]                          ^
      [  782.636696]
      [  782.636698] Pid: 4638, comm: ssh Not tainted 3.0.0-rc5+ #270 Hewlett-Packard HP Compaq 6005 Pro SFF PC/3047h
      [  782.636702] EIP: 0060:[<c13fefbb>] EFLAGS: 00010286 CPU: 0
      [  782.636707] EIP is at inet_getpeer+0x25b/0x5a0
      [  782.636709] EAX: 00000002 EBX: 00010080 ECX: f130f3c0 EDX: f0209d30
      [  782.636711] ESI: 0000bc87 EDI: 0000ea60 EBP: f0209ddc ESP: c173134c
      [  782.636712]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      [  782.636714] CR0: 8005003b CR2: f0beca80 CR3: 30246000 CR4: 000006d0
      [  782.636716] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
      [  782.636717] DR6: ffff4ff0 DR7: 00000400
      [  782.636718]  [<c13fbf76>] rt_set_nexthop.clone.45+0x56/0x220
      [  782.636722]  [<c13fc449>] __ip_route_output_key+0x309/0x860
      [  782.636724]  [<c141dc54>] tcp_v4_connect+0x124/0x450
      [  782.636728]  [<c142ce43>] inet_stream_connect+0xa3/0x270
      [  782.636731]  [<c13a8da1>] sys_connect+0xa1/0xb0
      [  782.636733]  [<c13a99dd>] sys_socketcall+0x25d/0x2a0
      [  782.636736]  [<c149deb8>] sysenter_do_call+0x12/0x28
      [  782.636738]  [<ffffffff>] 0xffffffff
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
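      The patch itself is not quoted here. As a hedged user-space sketch of
      this class of fix (the struct and function names are assumptions), the
      put side can retire the last reference and mark the entry dead in one
      atomic step, so a concurrent lockless lookup can never observe an
      intermediate state and resurrect a dying entry:

          #include <stdatomic.h>
          #include <stdlib.h>

          struct peer { atomic_int refcnt; /* ... tree links, keys ... */ };

          void peer_put(struct peer *p)
          {
              int old = atomic_load(&p->refcnt);

              for (;;) {
                  /* last reference: go from 1 straight to the -1 marker */
                  int new = (old == 1) ? -1 : old - 1;

                  /* on CAS failure, old is refreshed and we retry */
                  if (atomic_compare_exchange_weak(&p->refcnt, &old, new)) {
                      if (new == -1)
                          free(p);  /* the kernel defers this via RCU */
                      return;
                  }
              }
          }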
  10. 09 Jun 2011, 1 commit
    • inetpeer: remove unused list · 4b9d9be8
      Eric Dumazet authored
      Andi Kleen and Tim Chen reported huge contention on the inetpeer
      unused_peers.lock under a memcached workload on a 40-core machine
      with the route cache disabled.
      
      It appears we constantly flip peer refcnts between 0 and 1, and must
      insert/remove peers from unused_peers.list while holding a contended
      spinlock.
      
      Remove this list completely and perform garbage collection on the
      fly, at lookup time, using the expired nodes met during the tree
      traversal (see the sketch after this commit).
      
      This removes a lot of code, makes locking more standard, and obsoletes
      two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). It also
      removes two pointers from the inet_peer structure.
      
      There is still a false-sharing effect, because refcnt sits in the
      first cache line of the object [where the links and keys used by
      lookups are located]; we might move it to the end of the inet_peer
      structure so that this first cache line stays mostly read-only for
      other CPUs.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Andi Kleen <andi@firstfloor.org>
      CC: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
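      A hedged sketch of lookup-time garbage collection (simplified to a
      plain binary search tree; GC_MAX, PEER_TTL and the field names are
      assumptions): expired, unused nodes met on the way down are remembered
      so the caller can reap them after the lookup, with no global unused
      list and no extra lock:

          #include <stdatomic.h>
          #include <stddef.h>
          #include <time.h>

          struct peer {
              struct peer *left, *right;
              unsigned long key;
              atomic_int refcnt;       /* 0 == currently unused */
              time_t last_used;
          };

          #define GC_MAX   16          /* per-lookup reap budget */
          #define PEER_TTL 600         /* expiry, in seconds */

          struct peer *lookup_and_collect(struct peer *root, unsigned long key,
                                          struct peer **gc, int *gc_cnt)
          {
              time_t now = time(NULL);

              *gc_cnt = 0;
              for (struct peer *p = root; p != NULL;
                   p = (key < p->key) ? p->left : p->right) {
                  if (p->key == key)
                      return p;
                  /* remember expired candidates met during the traversal */
                  if (*gc_cnt < GC_MAX && atomic_load(&p->refcnt) == 0 &&
                      now - p->last_used > PEER_TTL)
                      gc[(*gc_cnt)++] = p;
              }
              return NULL;   /* caller reaps gc[0..*gc_cnt) under the lock */
          }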
  11. 28 May 2011, 1 commit
  12. 13 Apr 2011, 1 commit
    • inetpeer: reduce stack usage · 66944e1c
      Eric Dumazet authored
      On 64-bit arches, we use 752 bytes of stack when cleanup_once() is
      called from inet_getpeer().
      
      Let's share the AVL stack to save ~376 bytes (see the sketch after
      this commit).
      
      Before patch:
      
      # objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl
      
      0x000006c3 unlink_from_pool [inetpeer.o]:		376
      0x00000721 unlink_from_pool [inetpeer.o]:		376
      0x00000cb1 inet_getpeer [inetpeer.o]:			376
      0x00000e6d inet_getpeer [inetpeer.o]:			376
      0x0004 inet_initpeers [inetpeer.o]:			112
      # size net/ipv4/inetpeer.o
         text	   data	    bss	    dec	    hex	filename
         5320	    432	     21	   5773	   168d	net/ipv4/inetpeer.o
      
      After patch:
      
      # objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl
      0x00000c11 inet_getpeer [inetpeer.o]:			376
      0x00000dcd inet_getpeer [inetpeer.o]:			376
      0x00000ab9 peer_check_expire [inetpeer.o]:		328
      0x00000b7f peer_check_expire [inetpeer.o]:		328
      0x0004 inet_initpeers [inetpeer.o]:			112
      # size net/ipv4/inetpeer.o
         text	   data	    bss	    dec	    hex	filename
         5163	    432	     21	   5616	   15f0	net/ipv4/inetpeer.o
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Scot Doyle <lkml@scotdoyle.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Reviewed-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
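      The idea, as a hedged sketch (PEER_MAXDEPTH, the function bodies and
      signatures are assumptions): instead of cleanup_once() declaring its
      own back-pointer array on top of the caller's, the caller passes its
      array down, so only one ~376-byte stack buffer exists at a time:

          #include <stddef.h>

          struct peer;   /* opaque for this sketch */

          #define PEER_MAXDEPTH 40

          static void cleanup_once(unsigned long ttl, struct peer ***stack)
          {
              (void)ttl; (void)stack;
              /* ... walks the tree, recording back-pointers into the
               * caller's shared stack instead of a second local array ... */
          }

          struct peer *getpeer_sketch(unsigned long key)
          {
              struct peer **stack[PEER_MAXDEPTH];  /* the only on-stack copy */

              cleanup_once(600, stack);            /* reuse, don't redeclare */
              (void)key;
              /* ... the lookup itself uses the same array ... */
              return NULL;
          }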
  13. 14 Mar 2011, 2 commits
  14. 09 Mar 2011, 1 commit
  15. 05 Mar 2011, 1 commit
    • inetpeer: seqlock optimization · 65e8354e
      Eric Dumazet authored
      David noticed:
      
      ------------------
      Eric, I was profiling the non-routing-cache case and something that
      stuck out is the case of calling inet_getpeer() with create==0.
      
      If an entry is not found, we have to redo the lookup under a spinlock
      to make certain that a concurrent writer rebalancing the tree does
      not "hide" an existing entry from us.
      
      This makes the case of a create==0 lookup for a not-present entry
      really expensive.  It is on the order of 600 cpu cycles on my
      Niagara2.
      
      I added a hack to not do the relookup under the lock when create==0
      and it now costs less than 300 cycles.
      
      This is now a pretty common operation with the way we handle COW'd
      metrics, so I think it's definitely worth optimizing.
      -----------------
      
      One solution is to use a seqlock instead of a spinlock to protect
      struct inet_peer_base.
      
      After a failed AVL tree lookup, we can easily detect whether a writer
      made changes during our lookup. Taking the lock and redoing the lookup
      is only necessary in that case (see the sketch after this commit).
      
      Note: add a private rcu_deref_locked() macro to keep in one spot the
      access to the spinlock embedded in the seqlock.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
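      A hedged sketch of the described fast path (the helpers are stubs and
      every name is an assumption): read the sequence before the lockless
      walk; on a miss, an unchanged, even sequence proves no writer
      rebalanced the tree meanwhile, so the locked re-lookup can be skipped:

          #include <stdatomic.h>
          #include <stddef.h>

          struct peer;

          static atomic_uint base_seq;   /* even: idle, odd: writer active */

          /* stubs standing in for the real tree operations */
          static struct peer *lockless_lookup(unsigned long key)
          { (void)key; return NULL; }
          static struct peer *locked_lookup_or_insert(unsigned long key, int create)
          { (void)key; (void)create; return NULL; }

          struct peer *getpeer_sketch(unsigned long key, int create)
          {
              unsigned int seq = atomic_load(&base_seq);
              struct peer *p = lockless_lookup(key);

              if (p != NULL)
                  return p;
              /* clean miss: no writer ran during our walk */
              if (!create && (seq & 1) == 0 && atomic_load(&base_seq) == seq)
                  return NULL;
              return locked_lookup_or_insert(key, create);
          }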
  16. 11 Feb 2011, 2 commits
  17. 05 Feb 2011, 1 commit
  18. 28 Jan 2011, 1 commit
  19. 25 Jan 2011, 1 commit
  20. 02 Dec 2010, 1 commit
  21. 01 Dec 2010, 6 commits
  22. 28 Oct 2010, 1 commit
  23. 17 Jun 2010, 1 commit
    • inetpeer: restore small inet_peer structures · 317fe0e6
      Eric Dumazet authored
      Adding an rcu_head to struct inet_peer added 16 bytes on 64-bit arches.
      
      That's a bit unfortunate, since the old size was exactly 64 bytes.
      
      This can be solved by using a union between this rcu_head and the four
      fields that are only used while a refcount is held on the inet_peer;
      the rcu_head is used only when refcnt == -1, right before the structure
      is freed (see the sketch after this commit).
      
      Add an inet_peer_refcheck() function to check this assertion for a while.
      
      We can bring back the SLAB_HWCACHE_ALIGN qualifier at kmem cache creation.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
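      A hedged C11 sketch of the layout trick (field types are approximate):
      the rcu_head matters only once refcnt has dropped to -1, while the
      id/timestamp fields matter only while refcnt > 0, so the two can
      legally share storage:

          struct rcu_head_stub {           /* stand-in for the kernel type */
              struct rcu_head_stub *next;
              void (*func)(struct rcu_head_stub *);
          };

          struct inet_peer_sketch {
              /* ... avl_left, avl_right, key ... */
              int refcnt;                  /* -1 means "being freed" */
              union {
                  struct {
                      unsigned short rid;  /* frag ID generator */
                      unsigned int   ip_id_count;
                      unsigned int   tcp_ts;
                      unsigned int   tcp_ts_stamp;
                  };                       /* valid while refcnt > 0 */
                  struct rcu_head_stub rcu; /* valid only at refcnt == -1 */
              };
          };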
  24. 16 Jun 2010, 2 commits
    • inetpeer: do not use zero refcnt for freed entries · 5f2f8920
      Eric Dumazet authored
      Follow-up to commit aa1039e7 (inetpeer: RCU conversion).
      
      Unused inet_peer entries have a refcnt of zero.
      
      Using atomic_inc_not_zero() in RCU lookups cannot work for them, so
      the slow path is taken.
      
      Fix this by using a -1 marker instead of 0 for deleted entries (see
      the sketch after this commit).
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
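      A hedged sketch of the resulting lookup-side rule (user-space C, names
      assumed): with -1 reserved as the "deleted" marker, a reference can be
      taken on any entry not being freed, including unused entries sitting
      at refcnt == 0:

          #include <stdatomic.h>
          #include <stdbool.h>

          static bool ref_get_unless_dead(atomic_int *refcnt)
          {
              int old = atomic_load(refcnt);

              while (old != -1) {       /* only the -1 marker is off-limits */
                  /* on CAS failure, old is refreshed; recheck the marker */
                  if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
                      return true;
              }
              return false;             /* entry is dying: take slow path */
          }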
    • inetpeer: RCU conversion · aa1039e7
      Eric Dumazet authored
      inetpeer currently uses an AVL tree protected by an rwlock.
      
      It's possible to make most lookups use RCU:
      
      1) Add a struct rcu_head to struct inet_peer
      
      2) Add a lookup_rcu_bh() helper to perform a lockless, opportunistic
      lookup. This is a normal function, not a macro like lookup().
      
      3) Add a limit to the number of links followed by lookup_rcu_bh(), in
      case we fall into a loop (see the sketch after this commit).
      
      4) Add an smp_wmb() in link_to_pool() right before the node insert.
      
      5) Make unlink_from_pool() use atomic_cmpxchg() to make sure it can
      take the last reference to an inet_peer, since lockless readers could
      increase the refcount even while we hold peers.lock.
      
      6) Delay struct inet_peer freeing until after an RCU grace period so
      that lookup_rcu_bh() cannot crash.
      
      7) inet_getpeer() first attempts a lockless lookup.
         Note this lookup can fail even if the target is in the AVL tree,
      because a concurrent writer can leave the tree in a transiently
      inconsistent shape.
         If this attempt fails, the lock is taken and a regular lookup is
      performed again.
      
      8) Convert peers.lock from an rwlock to a spinlock.
      
      9) Remove SLAB_HWCACHE_ALIGN when peer_cachep is created, because
      rcu_head adds 16 bytes on 64-bit arches, doubling the effective size
      (64 -> 128 bytes).
      In a future patch it should be possible to revert this part, if the
      rcu field is put in a union sharing space with rid, ip_id_count,
      tcp_ts and tcp_ts_stamp, since these fields are manipulated only
      while refcnt > 0.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
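      A hedged sketch of point 3 (simplified to a plain binary search tree;
      MAX_WALK and the names are assumptions): a reader walking a tree that
      may be rebalanced underneath it can, in a torn view, follow links in a
      cycle, so the hop count is bounded and the caller falls back to the
      locked lookup:

          #include <stddef.h>

          struct peer {
              struct peer *left, *right;
              unsigned long key;
          };

          #define MAX_WALK 32

          struct peer *lookup_rcu_sketch(struct peer *root, unsigned long key)
          {
              int hops = 0;

              for (struct peer *p = root; p != NULL;
                   p = (key < p->key) ? p->left : p->right) {
                  if (p->key == key)
                      return p;       /* caller still has to take a ref */
                  if (++hops >= MAX_WALK)
                      break;          /* possible loop: give up */
              }
              return NULL;            /* caller retries under the lock */
          }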
  25. 15 Jun 2010, 1 commit
    • inetpeer: various changes · d6cc1d64
      Eric Dumazet authored
      Try to reduce cache line contentions in peer management, to reduce IP
      defragmentation overhead.
      
      - peer_fake_node is marked 'const' to make sure it's not modified
        (tested with CONFIG_DEBUG_RODATA=y).
      
      - Group variables into two structures to reduce the number of dirtied
      cache lines. One, named "peers", holds the AVL tree root, its number
      of entries, and the associated lock (a candidate for RCU conversion).
      
      - A second one, named "unused_peers", holds the unused list and its
      lock (see the sketch after this commit).
      
      - Add a !list_empty() test in unlink_from_unused() to avoid taking the
      lock when the entry is not on the unused list.
      
      - Use atomic_dec_and_lock() in inet_putpeer() to avoid taking the lock
      in some cases.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
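      A hedged user-space sketch of the regrouping (pthread locks stand in
      for kernel spinlocks; all names are assumptions): co-locating each
      root, counter and lock means a writer dirties one cache line instead
      of several scattered globals:

          #include <pthread.h>

          struct peer;   /* opaque here */

          static struct {
              struct peer    *root;    /* AVL tree root */
              int             total;   /* number of entries */
              pthread_mutex_t lock;    /* protects both fields above */
          } peers = { .lock = PTHREAD_MUTEX_INITIALIZER };

          static struct {
              struct peer    *list;    /* head of the unused list */
              pthread_mutex_t lock;
          } unused_peers = { .lock = PTHREAD_MUTEX_INITIALIZER };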
  26. 14 Nov 2009, 1 commit
    • inetpeer: Optimize inet_getid() · 2c1409a0
      Eric Dumazet authored
      While investigating network latencies, I found that inet_getid() was a
      contention point for some workloads, since inet_peer_idlock is shared
      by all inet_getid() users regardless of the peer involved.
      
      One way to fix this is to make ip_id_count an atomic_t instead of a
      __u16, and use atomic_add_return() (see the sketch after this commit).
      
      In order to keep sizeof(struct inet_peer) == 64 on 64-bit arches,
      tcp_ts_stamp is also converted to __u32 instead of "unsigned long".
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
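      In C11 terms the same idea looks like the hedged sketch below
      (atomic_fetch_add returns the pre-add value, so one lock-free
      operation reserves more+1 IDs per caller; the struct and function
      names are assumptions):

          #include <stdatomic.h>

          struct peer_ids {
              atomic_int ip_id_count;  /* per-peer counter, no shared lock */
          };

          static unsigned short getid_sketch(struct peer_ids *p, int more)
          {
              /* returns the old value; the counter advances by more + 1 */
              return (unsigned short)atomic_fetch_add(&p->ip_id_count,
                                                      more + 1);
          }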
  27. 03 Nov 2008, 1 commit
  28. 12 Jun 2008, 1 commit
  29. 13 Nov 2007, 1 commit
  30. 21 Jul 2007, 1 commit
    • [IPV4]: Fix inetpeer gcc-4.2 warnings · fc7b9380
      Patrick McHardy authored
        CC      net/ipv4/inetpeer.o
      net/ipv4/inetpeer.c: In function 'unlink_from_pool':
      net/ipv4/inetpeer.c:297: warning: the address of 'stack' will always evaluate as 'true'
      net/ipv4/inetpeer.c:297: warning: the address of 'stack' will always evaluate as 'true'
      net/ipv4/inetpeer.c: In function 'inet_getpeer':
      net/ipv4/inetpeer.c:409: warning: the address of 'stack' will always evaluate as 'true'
      net/ipv4/inetpeer.c:409: warning: the address of 'stack' will always evaluate as 'true'
      
      "Fix" by checking for != NULL.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
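      A hedged minimal reproducer of the diagnostic (the macros and names
      are assumptions, not the kernel's lookup() macro): when a call site
      passes an array, a bare truth test in the expanded macro compares the
      array's address, which gcc 4.2 flags as always true; the explicit
      != NULL comparison keeps the same semantics while silencing it:

          #include <stddef.h>

          struct node;

          #define LOOKUP_OLD(stack) \
              do { if (stack) { /* record back-pointers */ } } while (0)
          #define LOOKUP_FIXED(stack) \
              do { if ((stack) != NULL) { /* same check */ } } while (0)

          void demo(void)
          {
              struct node **stack[40];

              LOOKUP_OLD(stack);    /* gcc 4.2: "the address of 'stack'
                                       will always evaluate as 'true'" */
              LOOKUP_FIXED(stack);  /* no warning */
          }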