1. 03 Mar 2015, 1 commit
  2. 02 Mar 2015, 1 commit
  3. 03 Feb 2015, 1 commit
    • net-timestamp: no-payload only sysctl · b245be1f
      Authored by Willem de Bruijn
      Tx timestamps are looped onto the error queue on top of an skb. This
      mechanism leaks packet headers to processes unless the no-payload
      option SOF_TIMESTAMPING_OPT_TSONLY is set.
      
      Add a sysctl that optionally drops looped timestamps with data. This
      only affects processes without CAP_NET_RAW.
      
      The policy is checked when timestamps are generated in the stack.
      It is possible for timestamps with data to be reported after the
      sysctl is set, if these were queued internally earlier.
      
      No vulnerability is immediately known that exploits knowledge
      gleaned from packet headers, but it may still be preferable to allow
      administrators to lock down this path at the cost of possible
      breakage of legacy applications.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      
      ----
      
      Changes
        (v1 -> v2)
        - test socket CAP_NET_RAW instead of capable(CAP_NET_RAW)
        (rfc -> v1)
        - document the sysctl in Documentation/sysctl/net.txt
        - fix access control race: read .._OPT_TSONLY only once,
              use same value for permission check and skb generation.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b245be1f
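      A minimal userspace sketch of opting in to no-payload timestamps
      (assumes kernel headers new enough to define SOF_TIMESTAMPING_OPT_TSONLY):
      
          #include <linux/net_tstamp.h>
          #include <sys/socket.h>
          
          /* Request software TX timestamps, looped back without payload. */
          static int enable_tsonly(int fd)
          {
                  int val = SOF_TIMESTAMPING_TX_SOFTWARE |
                            SOF_TIMESTAMPING_SOFTWARE |
                            SOF_TIMESTAMPING_OPT_TSONLY;
          
                  return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
                                    &val, sizeof(val));
          }
      
      Administrators can then tighten the path with the new sysctl
      (net.core.tstamp_allow_data=0), dropping looped timestamps that
      still carry data for processes without CAP_NET_RAW.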
  4. 06 Dec 2014, 1 commit
    • net: sock: allow eBPF programs to be attached to sockets · 89aa0758
      Authored by Alexei Starovoitov
      Introduce a new setsockopt() command:
      
      setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd))
      
      where prog_fd was received from syscall bpf(BPF_PROG_LOAD, attr, ...)
      and attr->prog_type == BPF_PROG_TYPE_SOCKET_FILTER
      
      setsockopt() calls bpf_prog_get(), which increments the refcnt of the
      program, so it doesn't get unloaded while a socket is using it.
      
      The same eBPF program can be attached to multiple sockets.
      
      When the user task exits, its sockets are closed automatically, which
      calls sk_filter_uncharge() and decrements the refcnt of the eBPF program.
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      89aa0758
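      A hedged sketch of the flow, using the raw bpf(2) syscall and a
      minimal "accept everything" filter (error handling trimmed):
      
          #include <linux/bpf.h>
          #include <sys/socket.h>
          #include <sys/syscall.h>
          #include <unistd.h>
          
          static int attach_bpf(int sock)
          {
                  /* r0 = -1; exit -- pass every packet unchanged */
                  struct bpf_insn prog[] = {
                          { .code = BPF_ALU64 | BPF_MOV | BPF_K,
                            .dst_reg = BPF_REG_0, .imm = -1 },
                          { .code = BPF_JMP | BPF_EXIT },
                  };
                  union bpf_attr attr = {
                          .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
                          .insn_cnt  = sizeof(prog) / sizeof(prog[0]),
                          .insns     = (unsigned long)prog,
                          .license   = (unsigned long)"GPL",
                  };
                  int prog_fd = syscall(__NR_bpf, BPF_PROG_LOAD,
                                        &attr, sizeof(attr));
          
                  if (prog_fd < 0)
                          return -1;
                  return setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF,
                                    &prog_fd, sizeof(prog_fd));
          }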
  5. 25 Nov 2014, 1 commit
    • crypto: algif - add and use sock_kzfree_s() instead of memzero_explicit() · 79e88659
      Authored by Daniel Borkmann
      Commit e1bd95bf ("crypto: algif - zeroize IV buffer") and
      2a6af25b ("crypto: algif - zeroize message digest buffer")
      added memzero_explicit() calls on buffers that are later on
      passed back to sock_kfree_s().
      
      This is a discussed follow-up that, instead, extends the sock
      API and adds sock_kzfree_s(), which internally uses kzfree()
      instead of kfree() for passing the buffers back to slab.
      
      Having sock_kzfree_s() allows us to keep the changes minimal
      by just having a drop-in replacement instead of adding
      memzero_explicit() calls everywhere before sock_kfree_s().
      
      In kzfree(), the compiler is not allowed to optimize the memset()
      away, so there is no need for memzero_explicit(). Both
      sock_kfree_s() and sock_kzfree_s() are wrappers around
      __sock_kfree_s() and call into kfree() and kzfree(), respectively.
      __sock_kfree_s() needs to be explicitly inlined so that the
      compiler can optimize the call and condition away; this produces,
      e.g. on x86_64, the _same_ assembler output for sock_kfree_s()
      before and after, and thus also avoids code duplication.
      
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      79e88659
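      A sketch of the intended use in kernel code (a sensitive buffer
      charged to the socket, zeroized on free):
      
          #include <net/sock.h>
          
          static int do_op_with_iv(struct sock *sk, unsigned int ivlen)
          {
                  void *iv = sock_kmalloc(sk, ivlen, GFP_KERNEL);
          
                  if (!iv)
                          return -ENOMEM;
                  /* ... use iv as key material ... */
                  sock_kzfree_s(sk, iv, ivlen);  /* kzfree() path: cleared, then freed */
                  return 0;
          }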
  6. 12 Nov 2014, 1 commit
    • net: introduce SO_INCOMING_CPU · 2c8c56e1
      Authored by Eric Dumazet
      An alternative to RPS/RFS is to use hardware support for multiple
      queues.
      
      Then split a set of millions of sockets among worker threads, each
      one using epoll() to manage events on its own socket pool.
      
      Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
      know after accept() or connect() on which queue/cpu a socket is managed.
      
      We normally use one cpu per RX queue (IRQ smp_affinity being properly
      set), so remembering on the socket structure which cpu delivered the
      last packet is enough to solve the problem.
      
      After accept(), connect(), or even file descriptor passing between
      processes, applications can use:
      
       int cpu;
       socklen_t len = sizeof(cpu);
      
       getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);
      
      And use this information to put the socket into the right silo
      for optimal performance: the whole networking stack then runs
      on the appropriate cpu, without needing to send IPIs (RPS/RFS).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2c8c56e1
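      A sketch of the dispatch step (workers[] and worker_add_fd() are
      hypothetical application plumbing, one epoll loop per cpu):
      
          #include <sys/socket.h>
          
          struct worker;                           /* hypothetical */
          extern struct worker workers[];
          extern void worker_add_fd(struct worker *w, int fd);
          
          static void dispatch(int listen_fd)
          {
                  int fd = accept(listen_fd, NULL, NULL);
                  int cpu = 0;
                  socklen_t len = sizeof(cpu);
          
                  if (fd < 0)
                          return;
                  /* hand the fd to the worker pinned to the RX cpu */
                  if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU,
                                 &cpu, &len) == 0)
                          worker_add_fd(&workers[cpu], fd);
          }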
  7. 06 Nov 2014, 1 commit
    • net: Add and use skb_copy_datagram_msg() helper. · 51f3d02b
      Authored by David S. Miller
      This encapsulates all of the skb_copy_datagram_iovec() callers
      with call argument signature "skb, offset, msghdr->msg_iov, length".
      
      When we move to iov_iters in the networking, the iov_iter object will
      sit in the msghdr.
      
      Having a helper like this means there will be fewer places to touch
      during that transformation.
      
      Based upon descriptions and patch from Al Viro.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      51f3d02b
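      The helper reduces to a thin wrapper; something like:
      
          static inline int skb_copy_datagram_msg(const struct sk_buff *from,
                                                  int offset,
                                                  struct msghdr *msg,
                                                  int size)
          {
                  /* same call, but the msghdr is the argument */
                  return skb_copy_datagram_iovec(from, offset,
                                                 msg->msg_iov, size);
          }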
  8. 15 Oct 2014, 1 commit
  9. 28 Sep 2014, 1 commit
  10. 20 Sep 2014, 1 commit
  11. 09 Sep 2014, 1 commit
  12. 06 Sep 2014, 3 commits
  13. 02 Sep 2014, 1 commit
    • sock: deduplicate errqueue dequeue · 364a9e93
      Authored by Willem de Bruijn
      sk->sk_error_queue is dequeued in four locations. All share the
      exact same logic. Deduplicate.
      
      Also collapse the two critical sections for dequeue (at the top of
      the recv handler) and signal (at the bottom).
      
      This moves signal generation for the next packet forward, which should
      be harmless.
      
      It also changes the behavior if the recv handler exits early with an
      error. Previously, a signal for follow-up packets on the errqueue
      would then not be scheduled. The new behavior, to always signal, is
      arguably a bug fix.
      
      For rxrpc, the change causes the same function to be called repeatedly
      for each queued packet (because the recv handler == sk_error_report).
      It is likely that all packets will fail for the same reason (e.g.,
      memory exhaustion).
      
      This code runs without sk_lock held, so it is not safe to trust that
      sk->sk_err is immutable in between releasing q->lock and the
      subsequent test. Introduce a local int err just to avoid this
      potential race.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      364a9e93
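      The shared dequeue ends up approximately like this (note the local
      err read under q->lock, per the race discussion above):
      
          struct sk_buff *sock_dequeue_err_skb(struct sock *sk)
          {
                  struct sk_buff *skb, *skb_next;
                  unsigned long flags;
                  int err = 0;
          
                  spin_lock_irqsave(&sk->sk_error_queue.lock, flags);
                  skb = __skb_dequeue(&sk->sk_error_queue);
                  if (skb && (skb_next = skb_peek(&sk->sk_error_queue)))
                          err = SKB_EXT_ERR(skb_next)->ee.ee_errno;
                  spin_unlock_irqrestore(&sk->sk_error_queue.lock, flags);
          
                  sk->sk_err = err;
                  if (err)
                          sk->sk_error_report(sk);  /* signal the next packet */
          
                  return skb;
          }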
  14. 30 Aug 2014, 1 commit
    • net: attempt a single high order allocation · d9b2938a
      Authored by Eric Dumazet
      In commit ed98df33 ("net: use __GFP_NORETRY for high order
      allocations") we tried to address one issue caused by order-3
      allocations.
      
      We still observe high latencies and system overhead in situations where
      compaction is not successful.
      
      Instead of trying order-3, order-2, and order-1, do a single order-3
      best effort and immediately fall back to plain order-0.
      
      This mimics slub strategy to fallback to slab min order if the high
      order allocation used for performance failed.
      
      Order-3 allocations give a performance boost only if they can be done
      without incurring an expensive memory scan.
      
      Quoting David:
      
      The page allocator relies on synchronous (sync light) memory compaction
      after direct reclaim for allocations that don't retry and deferred
      compaction doesn't work with this strategy because the allocation order
      is always decreasing from the previous failed attempt.
      
      This means sync light compaction will always be encountered if memory
      cannot be defragmented or reclaimed several times during the
      skb_page_frag_refill() iteration.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d9b2938a
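      In gfp terms, the strategy is roughly (SKB_FRAG_PAGE_ORDER is
      get_order(32768), i.e. 3 with 4K pages):
      
          #include <linux/gfp.h>
          
          /* Sketch: one best-effort high order attempt, then order-0. */
          static struct page *frag_refill_alloc(gfp_t gfp, int order)
          {
                  struct page *page;
          
                  page = alloc_pages(gfp | __GFP_COMP | __GFP_NOWARN |
                                     __GFP_NORETRY, order);
                  if (page)
                          return page;
                  return alloc_pages(gfp, 0); /* may reclaim/compact/retry */
          }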
  15. 23 Aug 2014, 1 commit
  16. 06 Aug 2014, 3 commits
    • net-timestamp: TCP timestamping · 4ed2d765
      Authored by Willem de Bruijn
      TCP timestamping extends SO_TIMESTAMPING to bytestreams.
      
      Bytestreams do not have a 1:1 relationship between send() buffers and
      network packets. The feature interprets a send call on a bytestream as
      a request for a timestamp for the last byte in that send() buffer.
      
      The choice corresponds to a request for a timestamp when all bytes in
      the buffer have been sent. That assumption depends on in-order kernel
      transmission. This is the common case. That said, it is possible to
      construct a traffic shaping tree that would result in reordering.
      The guarantee is strong, then, but not ironclad.
      
      This implementation supports send and sendpage (splice). GSO replaces
      one large packet with multiple smaller packets. This patch also copies
      the option into the correct smaller packet.
      
      This patch does not yet support timestamping on data in an initial TCP
      Fast Open SYN, because that takes a very different data path.
      
      If ID generation in ee_data is enabled, bytestream timestamps return a
      byte offset, instead of the packet counter for datagrams.
      
      The implementation supports a single timestamp per packet. It silently
      replaces requests for previous timestamps. To avoid missing tstamps,
      flush the TCP queue by disabling Nagle, cork and autocork. Missing
      tstamps can be detected by offset when the ee_data ID is enabled.
      
      Implementation details:
      
      - On GSO, the timestamping code can be included in the main loop. I
      moved it into its own loop to reduce the impact on the common case
      to a single branch.
      
      - To avoid leaking the absolute seqno to userspace, the offset
      returned in ee_data must always be relative. It is an offset between
      an skb and sk field. The first is always set (also for GSO & ACK).
      The second must also never be uninitialized. Only allow the ID
      option on sockets in the ESTABLISHED state, for which the seqno
      is available. Never reset it to zero (instead, move it to the
      current seqno when reenabling the option).
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ed2d765
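      Enabling it from userspace is the usual SO_TIMESTAMPING dance; a
      sketch for a connected TCP socket:
      
          #include <linux/net_tstamp.h>
          #include <sys/socket.h>
          
          static int enable_tcp_tx_tstamps(int fd)
          {
                  int val = SOF_TIMESTAMPING_TX_SOFTWARE |
                            SOF_TIMESTAMPING_SOFTWARE;
          
                  /* each subsequent send() requests a timestamp
                   * for its last byte */
                  return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
                                    &val, sizeof(val));
          }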
    • net-timestamp: add key to disambiguate concurrent datagrams · 09c2d251
      Authored by Willem de Bruijn
      Datagrams timestamped on transmission can coexist in the kernel stack
      and be reordered in packet scheduling. When reading looped datagrams
      from the socket error queue it is not always possible to uniquely
      correlate looped data with the original send() call (for application
      level retransmits). Even if possible, it may be expensive and complex,
      requiring packet inspection.
      
      Introduce a data-independent ID mechanism to associate timestamps with
      send calls. Pass an ID alongside the timestamp in field ee_data of
      sock_extended_err.
      
      The ID is a simple 32 bit unsigned int that is associated with the
      socket and incremented on each send() call for which software tx
      timestamp generation is enabled.
      
      The feature is enabled only if SOF_TIMESTAMPING_OPT_ID is set, to
      avoid changing ee_data for existing applications that expect it to be 0.
      The counter is reset each time the flag is reenabled. Reenabling
      does not change the ID of already submitted data. It is possible
      to receive out of order IDs if the timestamp stream is not quiesced
      first.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      09c2d251
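      A sketch of draining the error queue and reading the ID (IPv4
      socket assumed; the cmsg level/type differ for IPv6):
      
          #include <linux/errqueue.h>
          #include <netinet/in.h>
          #include <stdio.h>
          #include <string.h>
          #include <sys/socket.h>
          #include <sys/uio.h>
          
          static void drain_one(int fd)
          {
                  char data[256], ctrl[512];
                  struct iovec iov = { .iov_base = data,
                                       .iov_len = sizeof(data) };
                  struct msghdr msg = {
                          .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = ctrl,
                          .msg_controllen = sizeof(ctrl),
                  };
                  struct cmsghdr *cm;
          
                  if (recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT) < 0)
                          return;
                  for (cm = CMSG_FIRSTHDR(&msg); cm;
                       cm = CMSG_NXTHDR(&msg, cm)) {
                          struct sock_extended_err err;
          
                          if (cm->cmsg_level != SOL_IP ||
                              cm->cmsg_type != IP_RECVERR)
                                  continue;
                          memcpy(&err, CMSG_DATA(cm), sizeof(err));
                          if (err.ee_origin == SO_EE_ORIGIN_TIMESTAMPING)
                                  printf("id: %u\n", err.ee_data);
                  }
          }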
    • net-timestamp: move timestamp flags out of sk_flags · b9f40e21
      Authored by Willem de Bruijn
      sk_flags is reaching its limit. New timestamping options will not fit.
      Move all of them into a new field sk->sk_tsflags.
      
      An added benefit is that this removes the boilerplate code that
      converts between SOF_TIMESTAMPING_.. and SOCK_TIMESTAMPING_.. in
      getsockopt/setsockopt.
      
      SOCK_TIMESTAMPING_RX_SOFTWARE is also used to toggle the receive
      timestamp logic (netstamp_needed). That can be simplified and this
      last key removed, but that is left for a separate patch.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      
      ----
      
      The u16 in sock can be moved into a 16-bit hole below sk_gso_max_segs,
      though that scatters tstamp fields throughout the struct.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b9f40e21
  17. 03 Aug 2014, 1 commit
  18. 30 Jul 2014, 1 commit
    • net: remove deprecated syststamp timestamp · 4d276eb6
      Authored by Willem de Bruijn
      The SO_TIMESTAMPING API defines three types of timestamps: software,
      hardware in raw format (hwtstamp) and hardware converted to system
      format (syststamp). The last has been deprecated in favor of combining
      hwtstamp with a PTP clock driver. There are no active users in the
      kernel.
      
      The option was device driver dependent. If set, but without hardware
      support, the correct behavior is to return zero in the relevant field
      in the SCM_TIMESTAMPING ancillary message. Without device drivers
      implementing the option, this field is effectively always zero.
      
      Remove the internal plumbing to dissuade new drivers from implementing
      the feature. Keep the SOF_TIMESTAMPING_SYS_HARDWARE flag, however, to
      avoid breaking existing applications that request the timestamp.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4d276eb6
  19. 24 Jul 2014, 1 commit
  20. 24 May 2014, 1 commit
  21. 25 Apr 2014, 1 commit
  22. 12 Apr 2014, 1 commit
    • net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      Authored by David S. Miller
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->sk_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But at the moment we place the SKB onto the socket receive queue it
      can be consumed and freed up.  So this skb->len access is potentially
      to freed up memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no actual implementation of this callback actually uses
      the length argument.  And since nobody actually cared about its
      value, lots of call sites pass arbitrary values in, such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      676d2369
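      The change, in effect:
      
          /* before */ void (*sk_data_ready)(struct sock *sk, int bytes);
          /* after  */ void (*sk_data_ready)(struct sock *sk);
          
          /* and at the call sites: */
          skb_queue_tail(&sk->sk_receive_queue, skb);
          sk->sk_data_ready(sk);   /* no skb->len read after queueing */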
  23. 12 Mar 2014, 1 commit
    • tcp: tcp_release_cb() should release socket ownership · c3f9b018
      Authored by Eric Dumazet
      Lars Persson reported the following deadlock:
      
      -000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock
      -001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0
      -002 |ip_local_deliver_finish(skb = 0x8BD527A0)
      -003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?)
      -004 |netif_receive_skb(skb = 0x8BD527A0)
      -005 |elk_poll(napi = 0x8C770500, budget = 64)
      -006 |net_rx_action(?)
      -007 |__do_softirq()
      -008 |do_softirq()
      -009 |local_bh_enable()
      -010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?)
      -011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0)
      -012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0)
      -013 |tcp_release_cb(sk = 0x8BE6B2A0)
      -014 |release_sock(sk = 0x8BE6B2A0)
      -015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?)
      -016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096)
      -017 |kernel_sendmsg(?, ?, ?, ?, size = 4096)
      -018 |smb_send_kvec()
      -019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0)
      -020 |cifs_call_async()
      -021 |cifs_async_writev(wdata = 0x87FD6580)
      -022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88)
      -023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88)
      -024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88)
      -025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88)
      -026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88)
      -027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0)
      -028 |bdi_writeback_workfn(work = 0x87E4A9CC)
      -029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC)
      -030 |worker_thread(__worker = 0x8B045880)
      -031 |kthread(_create = 0x87CADD90)
      -032 |ret_from_kernel_thread(asm)
      
      Bug occurs because __tcp_checksum_complete_user() enables BH, assuming
      it is running from softirq context.
      
      Lars' trace involved a NIC without RX checksum support, but other
      points are problematic as well, like the prequeue stuff.
      
      The problem is triggered by a timer that found the socket owned by
      the user.
      
      tcp_release_cb() should call tcp_write_timer_handler() or
      tcp_delack_timer_handler() in the appropriate context:
      
      BH disabled and socket lock held, but 'owned' field cleared,
      as if they were running from timer handlers.
      
      Fixes: 6f458dfb ("tcp: improve latencies of timer triggered events")
      Reported-by: Lars Persson <lars.persson@axis.com>
      Tested-by: Lars Persson <lars.persson@axis.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3f9b018
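      The fix, sketched: a helper drops ownership before the deferred
      handlers run, so they see the same context a real timer would
      (BH disabled, spinlock held, socket not owned):
      
          static inline void sock_release_ownership(struct sock *sk)
          {
                  sk->sk_lock.owned = 0;
          }
          
          /* in tcp_release_cb(), before invoking
           * tcp_write_timer_handler() / tcp_delack_timer_handler() */
          sock_release_ownership(sk);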
  24. 07 Feb 2014, 1 commit
    • net: use __GFP_NORETRY for high order allocations · ed98df33
      Authored by Eric Dumazet
      sock_alloc_send_pskb() & sk_page_frag_refill()
      have a loop trying high order allocations to prepare
      skbs with a low number of fragments, as this increases performance.
      
      The problem is that under memory pressure/fragmentation, this can
      trigger OOM while the intent was only to try the high order
      allocations, then fall back to order-0 allocations.
      
      We had various reports of unexpected regressions.
      
      According to David, setting __GFP_NORETRY should be fine,
      as the asynchronous compaction is still enabled, and this
      will prevent the OOM killer from kicking in, as in:
      
      CFSClientEventm invoked oom-killer: gfp_mask=0x42d0, order=3, oom_adj=0,
      oom_score_adj=0, oom_score_badness=2 (enabled),memcg_scoring=disabled
      CFSClientEventm
      
      Call Trace:
       [<ffffffff8043766c>] dump_header+0xe1/0x23e
       [<ffffffff80437a02>] oom_kill_process+0x6a/0x323
       [<ffffffff80438443>] out_of_memory+0x4b3/0x50d
       [<ffffffff8043a4a6>] __alloc_pages_may_oom+0xa2/0xc7
       [<ffffffff80236f42>] __alloc_pages_nodemask+0x1002/0x17f0
       [<ffffffff8024bd23>] alloc_pages_current+0x103/0x2b0
       [<ffffffff8028567f>] sk_page_frag_refill+0x8f/0x160
       [<ffffffff80295fa0>] tcp_sendmsg+0x560/0xee0
       [<ffffffff802a5037>] inet_sendmsg+0x67/0x100
       [<ffffffff80283c9c>] __sock_sendmsg_nosec+0x6c/0x90
       [<ffffffff80283e85>] sock_sendmsg+0xc5/0xf0
       [<ffffffff802847b6>] __sys_sendmsg+0x136/0x430
       [<ffffffff80284ec8>] sys_sendmsg+0x88/0x110
       [<ffffffff80711472>] system_call_fastpath+0x16/0x1b
      Out of Memory: Kill process 2856 (bash) score 9999 or sacrifice child
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ed98df33
  25. 19 Jan 2014, 1 commit
    • net: introduce SO_BPF_EXTENSIONS · ea02f941
      Authored by Michal Sekletar
      For user space packet capturing libraries such as libpcap, there's
      currently only one way to check which BPF extensions are supported
      by the kernel, that is, the behavior from commit aa1113d9 ("net:
      filter: return -EINVAL if BPF_S_ANC* operation is not supported").
      For querying all extensions at once this might be rather inconvenient.
      
      Therefore, this patch introduces a new option which can be used as
      an argument for getsockopt(), and allows one to obtain information
      about which BPF extensions are supported by the current kernel.
      
      As David Miller suggests, we do not need to define any bits right
      now, and the status quo can just return 0 in order to state that this
      version supports SKF_AD_PROTOCOL up to SKF_AD_PAY_OFFSET. Later
      additions to BPF extensions need to add their bits to the
      bpf_tell_extensions() function, as documented in the comment.
      Signed-off-by: Michal Sekletar <msekleta@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Reviewed-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ea02f941
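      A sketch of the query (any socket fd works; an error typically
      means an older kernel that does not know the option):
      
          #include <sys/socket.h>
          
          static int bpf_extensions(int fd)
          {
                  unsigned int ext = 0;
                  socklen_t len = sizeof(ext);
          
                  if (getsockopt(fd, SOL_SOCKET, SO_BPF_EXTENSIONS,
                                 &ext, &len) < 0)
                          return -1;
                  return ext;   /* 0 already means the base set is supported */
          }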
  26. 17 Jan 2014, 1 commit
  27. 04 Jan 2014, 3 commits
  28. 11 Dec 2013, 1 commit
  29. 23 Oct 2013, 1 commit
  30. 18 Oct 2013, 1 commit
    • net: refactor sk_page_frag_refill() · 400dfd3a
      Authored by Eric Dumazet
      While working on a new allocation strategy for virtio_net to increase
      the payload/truesize ratio, we found that refactoring
      sk_page_frag_refill() was needed.
      
      This patch splits sk_page_frag_refill() into two parts, adding
      skb_page_frag_refill() which can be used without a socket.
      
      While we are at it, add a minimum frag size of 32 for
      sk_page_frag_refill().
      
      Michael will either use netdev_alloc_frag() from softirq context,
      or skb_page_frag_refill() from process context in refill_work()
      (GFP_KERNEL allocations).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Michael Dalton <mwdalton@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      400dfd3a
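      A sketch of the socket-less variant, called from process context:
      
          #include <linux/skbuff.h>
          
          static int refill(struct page_frag *pf, unsigned int need)
          {
                  /* no socket accounting; plain GFP_KERNEL allocation */
                  if (!skb_page_frag_refill(need, pf, GFP_KERNEL))
                          return -ENOMEM;
                  /* pf->page + pf->offset now cover 'need' free bytes */
                  return 0;
          }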
  31. 09 Oct 2013, 1 commit
  32. 29 Sep 2013, 1 commit
    • net: introduce SO_MAX_PACING_RATE · 62748f32
      Authored by Eric Dumazet
      As mentioned in commit afe4fd06 ("pkt_sched: fq: Fair Queue packet
      scheduler"), this patch adds a new socket option.
      
      SO_MAX_PACING_RATE offers the application the ability to cap the
      rate computed by the transport layer. The value is in bytes per second.
      
      u32 val = 1000000;
      setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val));
      
      To be effectively paced, a flow must use the FQ packet scheduler.
      
      Note that a packet scheduler takes into account the headers for its
      computations. The effective payload rate depends on MSS and retransmits
      if any.
      
      I chose to make this pacing rate a SOL_SOCKET option instead of a
      TCP one because this can be used by other protocols.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      62748f32
  33. 10 Aug 2013, 1 commit
    • net: attempt high order allocations in sock_alloc_send_pskb() · 28d64271
      Authored by Eric Dumazet
      Adding paged frags skbs to af_unix sockets introduced a performance
      regression on large sends because of additional page allocations, even
      if each skb could carry at least 100% more payload than before.
      
      We can instruct sock_alloc_send_pskb() to attempt high order
      allocations.
      
      Most of the time, it does a single page allocation instead of 8.
      
      I added an additional parameter to sock_alloc_send_pskb() to
      let other users opt in to this new feature in follow-up patches.
      
      Tested:
      
      Before patch :
      
      $ netperf -t STREAM_STREAM
      STREAM STREAM TEST
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       2304  212992  212992    10.00    46861.15
      
      After patch :
      
      $ netperf -t STREAM_STREAM
      STREAM STREAM TEST
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       2304  212992  212992    10.00    57981.11
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      28d64271
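      The opt-in is the new trailing max-page-order argument; a hedged
      sketch of a caller requesting order-3 frags (the 32 KB figure is
      illustrative, not necessarily the value the af_unix path uses):
      
          skb = sock_alloc_send_pskb(sk, size - data_len, data_len,
                                     msg->msg_flags & MSG_DONTWAIT,
                                     &err, get_order(32768));
      
      Passing 0 keeps the old order-0 behaviour.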
  34. 02 Aug 2013, 1 commit