1. 10 Dec, 2011 1 commit
  2. 07 Dec, 2011 11 commits
  3. 06 Dec, 2011 5 commits
  4. 05 Dec, 2011 2 commits
    • tcp: tcp_sendmsg() page recycling · 761965ea
      Committed by Eric Dumazet
      If our TCP_PAGE(sk) is not shared (page_count() == 1), we can set page
      offset to 0.
      
      This permits better filling of the pages on small to medium tcp writes.
      
      "tbench 16" results on my dev server (2x4x2 machine) :
      
      Before : 3072 MB/s
      After  : 3146 MB/s  (2.4 % gain)
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
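The recycling rule in this commit message can be illustrated with a small user-space sketch. This is not the kernel code: `struct demo_page_cache`, `demo_page_reserve()`, `demo_page_release()`, and the 4096-byte page size are hypothetical names and values chosen for illustration, with `refcount` standing in for `page_count()`.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Hypothetical stand-in for the per-socket cached page and its write offset. */
struct demo_page_cache {
    unsigned int refcount; /* models page_count(): 1 means only the cache holds it */
    size_t offset;         /* next free byte in the page */
};

/*
 * Reserve `len` bytes in the cached page. If nobody else references the
 * page, all earlier users have released it, so writing restarts at offset 0.
 * Returns true and stores the start offset in *start on success.
 */
static bool demo_page_reserve(struct demo_page_cache *pc, size_t len, size_t *start)
{
    if (pc->refcount == 1)
        pc->offset = 0; /* recycle: the page is not shared */

    if (len > DEMO_PAGE_SIZE - pc->offset)
        return false; /* a real caller would have to get a fresh page */

    *start = pc->offset;
    pc->offset += len;
    pc->refcount++; /* models the reference taken by a fragment */
    return true;
}

/* Models a fragment being freed, which drops its page reference. */
static void demo_page_release(struct demo_page_cache *pc)
{
    pc->refcount--;
}
```

With a cache whose offset is already 3000 but whose refcount is 1, a 2000-byte reservation succeeds at offset 0 instead of failing for lack of room, which is the "better filling" effect the message describes.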
    • tcp: take care of misalignments · 117632e6
      Committed by Eric Dumazet
      We discovered that the TCP stack could retransmit misaligned skbs if a
      malicious peer acknowledged a sub-MSS frame. This currently can happen
      only if the output interface is not SG enabled: if SG is enabled, tcp
      builds headless skbs (all payload is included in fragments), so the tcp
      trimming process only removes parts of skb fragments, and headers stay
      aligned.
      
      Some arches can't handle misalignments, so force a head reallocation and
      shrink headroom to MAX_TCP_HEADER.
      
      Don't care about misalignments on x86 and PPC (or other arches setting
      NET_IP_ALIGN to 0).
      
      This patch introduces __pskb_copy(), which can specify the headroom of
      the new head, and pskb_copy() becomes a wrapper on top of __pskb_copy().
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
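The condition described above (reallocate the head only on architectures that care about alignment, and only when the data pointer is actually misaligned) can be sketched in user-space C. The function name `demo_needs_head_realloc()` and the 4-byte boundary are assumptions for illustration; the parameter `net_ip_align` merely mirrors the role NET_IP_ALIGN plays in the message.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does a buffer whose payload now starts at `data`
 * need a freshly allocated, aligned head before retransmission?
 *
 * net_ip_align == 0 models arches (such as x86 and PPC per the commit
 * message) that tolerate misaligned accesses, so no copy is forced.
 */
static bool demo_needs_head_realloc(const unsigned char *data, int net_ip_align)
{
    if (net_ip_align == 0)
        return false;

    /* Assumed requirement for this sketch: 4-byte alignment. */
    return ((uintptr_t)data & 3u) != 0;
}
```

If a peer acknowledges a number of bytes that is not a multiple of 4, trimming that many bytes from an aligned linear buffer leaves the start pointer off the boundary, which is the case this check would catch.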
  5. 04 Dec, 2011 1 commit
  6. 03 Dec, 2011 1 commit
  7. 02 Dec, 2011 4 commits
  8. 01 Dec, 2011 5 commits
  9. 30 Nov, 2011 2 commits
    • ipv4: remove useless codes in ipmr_device_event() · e92036a6
      Committed by RongQing.Li
      Commit 7dc00c82 added a 'notify' parameter for vif_delete() to
      distinguish whether to unregister the device.
      
      notify=1 means we do not need to unregister the device,
      so calling unregister_netdevice_many() is useless.
      Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: avoid frag allocation for small frames · f07d960d
      Committed by Eric Dumazet
      tcp_sendmsg() uses select_size() helper to choose skb head size when a
      new skb must be allocated.
      
      If GSO is enabled for the socket, current strategy is to force all
      payload data to be outside of headroom, in PAGE fragments.
      
      This strategy is a poor fit for small packets, because it wastes memory.
      
      Experiments show that best results are obtained when using 2048 bytes
      for skb head (This includes the skb overhead and various headers)
      
      This patch provides better len/truesize ratios for packets sent to the
      loopback device, and reduces memory needs for in-flight loopback packets,
      particularly on arches with big pages.
      
      If a sender sends many 1-byte packets to an unresponsive application,
      receiver rmem_alloc will grow faster and will stop queuing these packets
      sooner, or will collapse its receive queue to free excess memory.
      
      netperf -t TCP_RR results are improved by ~4 %, and many workloads are
      improved as well (tbench, mysql...)
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 29 Nov, 2011 3 commits
    • tcp: do not scale TSO segment size with reordering degree · 6b5a5c0d
      Committed by Neal Cardwell
      Since 2005 (c1b4a7e6) tcp_tso_should_defer has been using
      tcp_max_burst() as a target limit for deciding how large to make
      outgoing TSO packets when not using sysctl_tcp_tso_win_divisor. But
      since 2008 (dd9e0dda) tcp_max_burst() returns the
      reordering degree. We should not have tcp_tso_should_defer attempt to
      build larger segments just because there is more reordering. This
      commit splits the notion of deferral size used in TSO from the notion
      of burst size used in cwnd moderation, and returns the TSO deferral
      limit to its original value.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dont call jump_label_dec from irq context · b90e5794
      Committed by Eric Dumazet
      Igor Maravic reported an error caused by jump_label_dec() being called
      from IRQ context :
      
       BUG: sleeping function called from invalid context at kernel/mutex.c:271
       in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper
       1 lock held by swapper/0:
        #0:  (&n->timer){+.-...}, at: [<ffffffff8107ce90>] call_timer_fn+0x0/0x340
       Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1
      Call Trace:
       <IRQ>  [<ffffffff8104f417>] __might_sleep+0x137/0x1f0
       [<ffffffff816b9a2f>] mutex_lock_nested+0x2f/0x370
       [<ffffffff810a89fd>] ? trace_hardirqs_off+0xd/0x10
       [<ffffffff8109a37f>] ? local_clock+0x6f/0x80
       [<ffffffff810a90a5>] ? lock_release_holdtime.part.22+0x15/0x1a0
       [<ffffffff81557929>] ? sock_def_write_space+0x59/0x160
       [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
       [<ffffffff810969cd>] atomic_dec_and_mutex_lock+0x5d/0x80
       [<ffffffff8112fc1d>] jump_label_dec+0x1d/0x50
       [<ffffffff81566525>] net_disable_timestamp+0x15/0x20
       [<ffffffff81557a75>] sock_disable_timestamp+0x45/0x50
       [<ffffffff81557b00>] __sk_free+0x80/0x200
       [<ffffffff815578d0>] ? sk_send_sigurg+0x70/0x70
       [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
       [<ffffffff81557cba>] sock_wfree+0x3a/0x70
       [<ffffffff8155c2b0>] skb_release_head_state+0x70/0x120
       [<ffffffff8155c0b6>] __kfree_skb+0x16/0x30
       [<ffffffff8155c119>] kfree_skb+0x49/0x170
       [<ffffffff815e936e>] arp_error_report+0x3e/0x90
       [<ffffffff81575bd9>] neigh_invalidate+0x89/0xc0
       [<ffffffff81578dbe>] neigh_timer_handler+0x9e/0x2a0
       [<ffffffff81578d20>] ? neigh_update+0x640/0x640
       [<ffffffff81073558>] __do_softirq+0xc8/0x3a0
      
      Since jump_label_{inc|dec} must be called from process context only,
      we must defer jump_label_dec() if net_disable_timestamp() is called
      from interrupt context.
      Reported-by: Igor Maravic <igorm@etf.rs>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
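The deferral idea in this commit message can be sketched in user-space C with C11 atomics. Everything here is hypothetical: the `demo_*` names, the `in_atomic_context` flag passed by the caller, and in particular the choice to apply the postponed decrements on the next enable call, which the message itself does not specify.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int demo_key_count;     /* stands in for the key's use count */
static atomic_int demo_deferred_decs; /* decrements postponed from atomic context */

/* Stand-ins for operations that may sleep: process context only. */
static void demo_key_inc(void) { atomic_fetch_add(&demo_key_count, 1); }
static void demo_key_dec(void) { atomic_fetch_sub(&demo_key_count, 1); }

/* Disable path: in atomic context, only record that a decrement is owed. */
static void demo_disable(bool in_atomic_context)
{
    if (in_atomic_context) {
        atomic_fetch_add(&demo_deferred_decs, 1);
        return;
    }
    demo_key_dec();
}

/*
 * Enable path, assumed to run in process context: first apply any
 * decrements that were postponed, then take the new reference.
 */
static void demo_enable(void)
{
    int pending = atomic_exchange(&demo_deferred_decs, 0);

    while (pending-- > 0)
        demo_key_dec();
    demo_key_inc();
}
```

The point of the pattern is that the interrupt-side path performs only a non-sleeping atomic increment, while the operation that takes a mutex runs later from a context where sleeping is allowed.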
    • tcp: tcp_sendmsg() wrong access to sk_route_caps · 690e99c4
      Committed by Eric Dumazet
      Now that sk_route_caps is a u64, it's dangerous to use an integer to
      store the result of an AND operation. It won't work if NETIF_F_SG is
      moved to the upper part of the u64.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Michał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
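The hazard described here is easy to reproduce outside the kernel. In the sketch below, `DEMO_F_SG` is a hypothetical flag deliberately placed at bit 40 to show what would happen if a feature bit lived in the upper half of a 64-bit mask; it is not the real value of NETIF_F_SG, and the `demo_*` functions are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit in the upper 32 bits of a 64-bit mask. */
#define DEMO_F_SG (UINT64_C(1) << 40)

/*
 * Problematic pattern: the masked 64-bit value is narrowed to int.
 * The conversion is implementation-defined; common compilers keep only
 * the low 32 bits, so the flag is silently lost.
 */
static int demo_has_sg_truncating(uint64_t caps)
{
    int sg = (int)(caps & DEMO_F_SG);

    return sg;
}

/* Safer pattern: reduce the mask test to a boolean before storing it. */
static bool demo_has_sg(uint64_t caps)
{
    return (caps & DEMO_F_SG) != 0;
}
```

Comparing the masked value against zero (or using `!!`) keeps the result correct wherever the bit sits in the 64-bit word.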
  11. 28 Nov, 2011 5 commits