1. 17 11月, 2011 1 次提交
  2. 15 11月, 2011 1 次提交
  3. 14 11月, 2011 1 次提交
    • E
      neigh: new unresolved queue limits · 8b5c171b
      Eric Dumazet 提交于
      Le mercredi 09 novembre 2011 à 16:21 -0500, David Miller a écrit :
      > From: David Miller <davem@davemloft.net>
      > Date: Wed, 09 Nov 2011 16:16:44 -0500 (EST)
      >
      > > From: Eric Dumazet <eric.dumazet@gmail.com>
      > > Date: Wed, 09 Nov 2011 12:14:09 +0100
      > >
      > >> unres_qlen is the number of frames we are able to queue per unresolved
      > >> neighbour. Its default value (3) was never changed and is responsible
      > >> for strange drops, especially if IP fragments are used, or multiple
      > >> sessions start in parallel. Even a single tcp flow can hit this limit.
      > >  ...
      > >
      > > Ok, I've applied this, let's see what happens :-)
      >
      > Early answer, build fails.
      >
      > Please test build this patch with DECNET enabled and resubmit.  The
      > decnet neigh layer still refers to the removed ->queue_len member.
      >
      > Thanks.
      
      Ouch, this was fixed on one machine yesterday, but not the other one I
      used this morning, sorry.
      
      [PATCH V5 net-next] neigh: new unresolved queue limits
      
      unres_qlen is the number of frames we are able to queue per unresolved
      neighbour. Its default value (3) was never changed and is responsible
      for strange drops, especially if IP fragments are used, or multiple
      sessions start in parallel. Even a single tcp flow can hit this limit.
      
      $ arp -d 192.168.20.108 ; ping -c 2 -s 8000 192.168.20.108
      PING 192.168.20.108 (192.168.20.108) 8000(8028) bytes of data.
      8008 bytes from 192.168.20.108: icmp_seq=2 ttl=64 time=0.322 ms
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8b5c171b
  4. 10 11月, 2011 2 次提交
    • E
      ipv4: PKTINFO doesnt need dst reference · d826eb14
      Eric Dumazet 提交于
      Le lundi 07 novembre 2011 à 15:33 +0100, Eric Dumazet a écrit :
      
      > At least, in recent kernels we dont change dst->refcnt in forwarding
      > patch (usinf NOREF skb->dst)
      >
      > One particular point is the atomic_inc(dst->refcnt) we have to perform
      > when queuing an UDP packet if socket asked PKTINFO stuff (for example a
      > typical DNS server has to setup this option)
      >
      > I have one patch somewhere that stores the information in skb->cb[] and
      > avoid the atomic_{inc|dec}(dst->refcnt).
      >
      
      OK I found it, I did some extra tests and believe its ready.
      
      [PATCH net-next] ipv4: IP_PKTINFO doesnt need dst reference
      
      When a socket uses IP_PKTINFO notifications, we currently force a dst
      reference for each received skb. Reader has to access dst to get needed
      information (rt_iif & rt_spec_dst) and must release dst reference.
      
      We also forced a dst reference if skb was put in socket backlog, even
      without IP_PKTINFO handling. This happens under stress/load.
      
      We can instead store the needed information in skb->cb[], so that only
      softirq handler really access dst, improving cache hit ratios.
      
      This removes two atomic operations per packet, and false sharing as
      well.
      
      On a benchmark using a mono threaded receiver (doing only recvmsg()
      calls), I can reach 720.000 pps instead of 570.000 pps.
      
      IP_PKTINFO is typically used by DNS servers, and any multihomed aware
      UDP application.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d826eb14
    • E
      ipv4: reduce percpu needs for icmpmsg mibs · acb32ba3
      Eric Dumazet 提交于
      Reading /proc/net/snmp on a machine with a lot of cpus is very expensive
      (can be ~88000 us).
      
      This is because ICMPMSG MIB uses 4096 bytes per cpu, and folding values
      for all possible cpus can read 16 Mbytes of memory.
      
      ICMP messages are not considered as fast path on a typical server, and
      eventually few cpus handle them anyway. We can afford an atomic
      operation instead of using percpu data.
      
      This saves 4096 bytes per cpu and per network namespace.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      acb32ba3
  5. 09 11月, 2011 4 次提交
  6. 04 11月, 2011 1 次提交
  7. 02 11月, 2011 2 次提交
  8. 01 11月, 2011 3 次提交
  9. 27 10月, 2011 1 次提交
  10. 25 10月, 2011 2 次提交
  11. 24 10月, 2011 5 次提交
  12. 22 10月, 2011 1 次提交
  13. 21 10月, 2011 5 次提交
  14. 20 10月, 2011 2 次提交
  15. 19 10月, 2011 3 次提交
  16. 14 10月, 2011 1 次提交
    • E
      net: more accurate skb truesize · 87fb4b7b
      Eric Dumazet 提交于
      skb truesize currently accounts for sk_buff struct and part of skb head.
      kmalloc() roundings are also ignored.
      
      Considering that skb_shared_info is larger than sk_buff, its time to
      take it into account for better memory accounting.
      
      This patch introduces SKB_TRUESIZE(X) macro to centralize various
      assumptions into a single place.
      
      At skb alloc phase, we put skb_shared_info struct at the exact end of
      skb head, to allow a better use of memory (lowering number of
      reallocations), since kmalloc() gives us power-of-two memory blocks.
      
      Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
      aligned to cache lines, as before.
      
      Note: This patch might trigger performance regressions because of
      misconfigured protocol stacks, hitting per socket or global memory
      limits that were previously not reached. But its a necessary step for a
      more accurate memory accounting.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      CC: Andi Kleen <ak@linux.intel.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      87fb4b7b
  17. 13 10月, 2011 1 次提交
  18. 12 10月, 2011 1 次提交
  19. 05 10月, 2011 2 次提交
    • Y
      tcp: properly update lost_cnt_hint during shifting · 1e5289e1
      Yan, Zheng 提交于
      lost_skb_hint is used by tcp_mark_head_lost() to mark the first unhandled skb.
      lost_cnt_hint is the number of packets or sacked packets before the lost_skb_hint;
      When shifting a skb that is before the lost_skb_hint, if tcp_is_fack() is ture,
      the skb has already been counted in the lost_cnt_hint; if tcp_is_fack() is false,
      tcp_sacktag_one() will increase the lost_cnt_hint. So tcp_shifted_skb() does not
      need to adjust the lost_cnt_hint by itself. When shifting a skb that is equal to
      lost_skb_hint, the shifted packets will not be counted by tcp_mark_head_lost().
      So tcp_shifted_skb() should adjust the lost_cnt_hint even tcp_is_fack(tp) is true.
      Signed-off-by: NZheng Yan <zheng.z.yan@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1e5289e1
    • Y
      tcp: properly handle md5sig_pool references · 260fcbeb
      Yan, Zheng 提交于
      tcp_v4_clear_md5_list() assumes that multiple tcp md5sig peers
      only hold one reference to md5sig_pool. but tcp_v4_md5_do_add()
      increases use count of md5sig_pool for each peer. This patch
      makes tcp_v4_md5_do_add() only increases use count for the first
      tcp md5sig peer.
      Signed-off-by: NZheng Yan <zheng.z.yan@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      260fcbeb
  20. 04 10月, 2011 1 次提交