1. 05 Mar, 2011 · 4 commits
    • ipv4: Optimize flow initialization in output route lookup. · 44713b67
      Authored by David S. Miller
      We burn a lot of useless cycles, CPU store-buffer traffic, and
      memory operations memset()'ing the on-stack flow used to perform
      output route lookups in __ip_route_output_key().
      
      Only the first half of the flow object's members even matters for
      output route lookups in this context; specifically:
      
      FIB rules matching cares about:
      
      	dst, src, tos, iif, oif, mark
      
      FIB trie lookup cares about:
      
      	dst
      
      FIB semantic match cares about:
      
      	tos, scope, oif
      
      Therefore only initialize these specific members and elide the
      memset entirely.
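      As a rough illustration of the change, here is a self-contained
      before/after sketch; the struct and its member names are
      illustrative stand-ins, not the kernel's real struct flowi:
      
        #include <string.h>
        
        /* Illustrative stand-in for the on-stack flow key; the real
         * struct flowi has different member names and layout. */
        struct flow_key {
                unsigned int    dst, src;   /* rules, trie, semantic match */
                unsigned char   tos, scope;
                int             iif, oif;
                unsigned int    mark;
                unsigned long   other[16];  /* never read on this path */
        };
        
        /* Before: memset() writes every byte, including the large tail
         * that output route lookups never read. */
        static void flow_init_memset(struct flow_key *fl, unsigned int dst,
                                     unsigned int src, unsigned char tos,
                                     int oif, unsigned int mark)
        {
                memset(fl, 0, sizeof(*fl));
                fl->dst = dst;
                fl->src = src;
                fl->tos = tos;
                fl->oif = oif;
                fl->mark = mark;
        }
        
        /* After: assign only the members the lookup actually reads and
         * leave the rest untouched; this is safe precisely because rules
         * matching, trie lookup and semantic match read nothing else. */
        static void flow_init_sparse(struct flow_key *fl, unsigned int dst,
                                     unsigned int src, unsigned char tos,
                                     int iif, int oif, unsigned int mark)
        {
                fl->dst   = dst;
                fl->src   = src;
                fl->tos   = tos;
                fl->scope = 0;  /* RT_SCOPE_UNIVERSE in the kernel */
                fl->iif   = iif;
                fl->oif   = oif;
                fl->mark  = mark;
        }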
      
      On Niagara2 this shaves roughly 300 cycles off the output route
      lookup path.
      
      We can likely take things further, since all callers of output
      route lookups essentially throw away the on-stack flow they use,
      so they don't care if we use it as a scratch pad to compute the
      final flow key.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      44713b67
    • inetpeer: seqlock optimization · 65e8354e
      Authored by Eric Dumazet
      David noticed:
      
      ------------------
      Eric, I was profiling the non-routing-cache case and something that
      stuck out is the case of calling inet_getpeer() with create==0.
      
      If an entry is not found, we have to redo the lookup under a spinlock
      to make certain that a concurrent writer rebalancing the tree does
      not "hide" an existing entry from us.
      
      This makes the case of a create==0 lookup for a not-present entry
      really expensive.  It is on the order of 600 cpu cycles on my
      Niagara2.
      
      I added a hack to not do the relookup under the lock when create==0
      and it now costs less than 300 cycles.
      
      This is now a pretty common operation with the way we handle COW'd
      metrics, so I think it's definitely worth optimizing.
      ------------------
      
      One solution is to use a seqlock instead of a spinlock to protect struct
      inet_peer_base.
      
      After a failed AVL tree lookup, we can easily detect whether a
      writer made changes during our lookup. Taking the lock and redoing
      the lookup is necessary only in that case.
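      A minimal kernel-style sketch of that fast path follows. The
      peer_base container and the avl_lookup_rcu()/avl_lookup() helpers
      are hypothetical stand-ins, not the kernel's real names; the
      seqlock calls themselves are the standard <linux/seqlock.h> API:
      
        #include <linux/types.h>
        #include <linux/seqlock.h>
        
        struct peer;                            /* opaque tree node */
        
        struct peer_base {
                struct peer     *root;
                seqlock_t       lock;   /* writers hold the embedded spinlock */
        };
        
        /* Stand-in lookup helpers, assumed to exist for this sketch. */
        struct peer *avl_lookup_rcu(struct peer_base *base, __be32 key);
        struct peer *avl_lookup(struct peer_base *base, __be32 key);
        
        /* The create==0 fast path: a hit is always valid, and a miss is
         * valid whenever no writer ran during our lockless walk. */
        struct peer *peer_lookup(struct peer_base *base, __be32 key)
        {
                unsigned int seq;
                struct peer *p;
        
                seq = read_seqbegin(&base->lock);
                p = avl_lookup_rcu(base, key);  /* lockless tree walk */
                if (p)
                        return p;
        
                if (!read_seqretry(&base->lock, seq))
                        return NULL;    /* no concurrent writer: trust the miss */
        
                /* A writer rebalanced the tree during our walk; only now
                 * do we pay for the locked relookup. */
                write_seqlock(&base->lock);
                p = avl_lookup(base, key);
                write_sequnlock(&base->lock);
                return p;
        }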
      
      Note: add a private rcu_deref_locked() macro to keep the access to
      the spinlock embedded in the seqlock in one spot.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65e8354e
    • d72751ed
    • Merge branch 'master' of... · 85a7045a
      Authored by John W. Linville
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
      85a7045a
  2. 04 Mar, 2011 · 26 commits
  3. 03 Mar, 2011 · 10 commits