提交 · 7f1fb60c4fc9fb29fbb406ac8c4cfb4e59e168d6 · openeuler / Kernel

07 12月, 2011 1 次提交

inet_diag: Partly rename inet_ to sock_ · 7f1fb60c

由 Pavel Emelyanov 提交于 12月 06, 2011

The ultimate goal is to get the sock_diag module, that works in
family+protocol terms. Currently this is suitable to do on the
inet_diag basis, so rename parts of the code. It will be moved
to sock_diag.c later.
Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7f1fb60c

06 12月, 2011 4 次提交

ipv4: arp: Cleanup in arp.c · 40e4783e

由 Igor Maravic 提交于 12月 05, 2011

Use "IS_ENABLED(CONFIG_FOO)" macro instead of
"defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)"
Signed-off-by: NIgor Maravic <igorm@etf.rs>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40e4783e

tcp: remove TCP_OFF and TCP_PAGE macros · 0a5912db

由 Eric Dumazet 提交于 12月 05, 2011

As mentioned by Joe Perches, TCP_OFF() and TCP_PAGE() macros are
useless.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0a5912db

tcp: fix tcp_trim_head() · 4fa48bf3

由 Eric Dumazet 提交于 12月 04, 2011

commit f07d960d (tcp: avoid frag allocation for small frames)
breaked assumption in tcp stack that skb is either linear (skb->data_len
== 0), or fully fragged (skb->data_len == skb->len)

tcp_trim_head() made this assumption, we must fix it.

Thanks to Vijay for providing a very detailed explanation.
Reported-by: NVijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4fa48bf3

net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}. · 27217455

由 David Miller 提交于 12月 02, 2011

To reflect the fact that a refrence is not obtained to the
resulting neighbour entry.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NRoland Dreier <roland@purestorage.com>

27217455

05 12月, 2011 2 次提交

tcp: tcp_sendmsg() page recycling · 761965ea

由 Eric Dumazet 提交于 12月 04, 2011

If our TCP_PAGE(sk) is not shared (page_count() == 1), we can set page
offset to 0.

This permits better filling of the pages on small to medium tcp writes.

"tbench 16" results on my dev server (2x4x2 machine) :

Before : 3072 MB/s
After  : 3146 MB/s  (2.4 % gain)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

761965ea

tcp: take care of misalignments · 117632e6

由 Eric Dumazet 提交于 12月 03, 2011

We discovered that TCP stack could retransmit misaligned skbs if a
malicious peer acknowledged sub MSS frame. This currently can happen
only if output interface is non SG enabled : If SG is enabled, tcp
builds headless skbs (all payload is included in fragments), so the tcp
trimming process only removes parts of skb fragments, header stay
aligned.

Some arches cant handle misalignments, so force a head reallocation and
shrink headroom to MAX_TCP_HEADER.

Dont care about misaligments on x86 and PPC (or other arches setting
NET_IP_ALIGN to 0)

This patch introduces __pskb_copy() which can specify the headroom of
new head, and pskb_copy() becomes a wrapper on top of __pskb_copy()
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

117632e6

04 12月, 2011 1 次提交

tcp: drop SYN+FIN messages · fdf5af0d

由 Eric Dumazet 提交于 12月 02, 2011

Denys Fedoryshchenko reported that SYN+FIN attacks were bringing his
linux machines to their limits.

Dont call conn_request() if the TCP flags includes SYN flag
Reported-by: NDenys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fdf5af0d

02 12月, 2011 4 次提交

ipv4: flush route cache after change accept_local · d01ff0a0

由 Peter Pan(潘卫平) 提交于 12月 01, 2011

After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush route cache, or it will continue receive packets with local
source address, which should be dropped.
Signed-off-by: NWeiping Pan <panweiping3@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d01ff0a0

Revert "udp: remove redundant variable" · 59c2cdae

由 David S. Miller 提交于 12月 01, 2011

This reverts commit 81d54ec8.

If we take the "try_again" goto, due to a checksum error,
the 'len' has already been truncated.  So we won't compute
the same values as the original code did.
Reported-by: Npaul bilke <fsmail@conspiracy.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

59c2cdae

ipv4: Perform peer validation on cached route lookup. · efbc368d

由 David S. Miller 提交于 12月 01, 2011

Otherwise we won't notice the peer GENID change.
Reported-by: NSteffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

efbc368d

ipv4: use a 64bit load/store in output path · 84f9307c

由 Eric Dumazet 提交于 11月 30, 2011

gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In IP header, daddr immediately follows saddr, this wont change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields wont
break the rule.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

84f9307c

01 12月, 2011 5 次提交

ipv4 : igmp : Delete useless parameter in ip_mc_add1_src() · 5eb81e89

由 Jun Zhao 提交于 11月 30, 2011

Need not to used 'delta' flag when add single-source to interface
filter source list.
Signed-off-by: NJun Zhao <mypopydev@gmail.com>
Signed-off-by: NDavid S. Miller <davem@drr.davemloft.net>

5eb81e89

atm: clip: Use device neigh support on top of "arp_tbl". · 32092ecf

由 David Miller 提交于 7月 25, 2011

Instead of instantiating an entire new neigh_table instance
just for ATM handling, use the neigh device private facility.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

32092ecf

neigh: Do not set tbl->entry_size in ipv4/ipv6 neigh tables. · 76cc714e

由 David Miller 提交于 7月 25, 2011

Let the core self-size the neigh entry based upon the key length.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

76cc714e

ipv4: fix lockdep splat in rt_cache_seq_show · 218fa90f

由 Eric Dumazet 提交于 11月 29, 2011

After commit f2c31e32 (fix NULL dereferences in check_peer_redir()),
dst_get_neighbour() should be guarded by rcu_read_lock() /
rcu_read_unlock() section.
Reported-by: NMiles Lane <miles.lane@gmail.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

218fa90f

tcp: inherit listener congestion control for passive cnx · d8a6e65f

由 Eric Dumazet 提交于 11月 30, 2011

Rick Jones reported that TCP_CONGESTION sockopt performed on a listener
was ignored for its children sockets : right after accept() the
congestion control for new socket is the system default one.

This seems an oversight of the initial design (quoted from Stephen)

Based on prior investigation and patch from Rick.
Reported-by: NRick Jones <rick.jones2@hp.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Yuchung Cheng <ycheng@google.com>
Tested-by: NRick Jones <rick.jones2@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d8a6e65f

30 11月, 2011 2 次提交

ipv4: remove useless codes in ipmr_device_event() · e92036a6

由 RongQing.Li 提交于 11月 23, 2011

Commit 7dc00c82 added a 'notify' parameter for vif_delete() to
distinguish whether to unregister the device.

When notify=1 means we does not need to unregister the device,
so calling unregister_netdevice_many is useless.
Signed-off-by: NRongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e92036a6

tcp: avoid frag allocation for small frames · f07d960d

由 Eric Dumazet 提交于 11月 28, 2011

tcp_sendmsg() uses select_size() helper to choose skb head size when a
new skb must be allocated.

If GSO is enabled for the socket, current strategy is to force all
payload data to be outside of headroom, in PAGE fragments.

This strategy is not welcome for small packets, wasting memory.

Experiments show that best results are obtained when using 2048 bytes
for skb head (This includes the skb overhead and various headers)

This patch provides better len/truesize ratios for packets sent to
loopback device, and reduce memory needs for in-flight loopback packets,
particularly on arches with big pages.

If a sender sends many 1-byte packets to an unresponsive application,
receiver rmem_alloc will grow faster and will stop queuing these packets
sooner, or will collapse its receive queue to free excess memory.

netperf -t TCP_RR results are improved by ~4 %, and many workloads are
improved as well (tbench, mysql...)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f07d960d

29 11月, 2011 3 次提交

tcp: do not scale TSO segment size with reordering degree · 6b5a5c0d

由 Neal Cardwell 提交于 11月 21, 2011

Since 2005 (c1b4a7e6)
tcp_tso_should_defer has been using tcp_max_burst() as a target limit
for deciding how large to make outgoing TSO packets when not using
sysctl_tcp_tso_win_divisor. But since 2008
(dd9e0dda) tcp_max_burst() returns the
reordering degree. We should not have tcp_tso_should_defer attempt to
build larger segments just because there is more reordering. This
commit splits the notion of deferral size used in TSO from the notion
of burst size used in cwnd moderation, and returns the TSO deferral
limit to its original value.
Signed-off-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6b5a5c0d

net: dont call jump_label_dec from irq context · b90e5794

由 Eric Dumazet 提交于 11月 28, 2011

Igor Maravic reported an error caused by jump_label_dec() being called
from IRQ context :

 BUG: sleeping function called from invalid context at kernel/mutex.c:271
 in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper
 1 lock held by swapper/0:
  #0:  (&n->timer){+.-...}, at: [<ffffffff8107ce90>] call_timer_fn+0x0/0x340
 Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1
Call Trace:
 <IRQ>  [<ffffffff8104f417>] __might_sleep+0x137/0x1f0
 [<ffffffff816b9a2f>] mutex_lock_nested+0x2f/0x370
 [<ffffffff810a89fd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff8109a37f>] ? local_clock+0x6f/0x80
 [<ffffffff810a90a5>] ? lock_release_holdtime.part.22+0x15/0x1a0
 [<ffffffff81557929>] ? sock_def_write_space+0x59/0x160
 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
 [<ffffffff810969cd>] atomic_dec_and_mutex_lock+0x5d/0x80
 [<ffffffff8112fc1d>] jump_label_dec+0x1d/0x50
 [<ffffffff81566525>] net_disable_timestamp+0x15/0x20
 [<ffffffff81557a75>] sock_disable_timestamp+0x45/0x50
 [<ffffffff81557b00>] __sk_free+0x80/0x200
 [<ffffffff815578d0>] ? sk_send_sigurg+0x70/0x70
 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
 [<ffffffff81557cba>] sock_wfree+0x3a/0x70
 [<ffffffff8155c2b0>] skb_release_head_state+0x70/0x120
 [<ffffffff8155c0b6>] __kfree_skb+0x16/0x30
 [<ffffffff8155c119>] kfree_skb+0x49/0x170
 [<ffffffff815e936e>] arp_error_report+0x3e/0x90
 [<ffffffff81575bd9>] neigh_invalidate+0x89/0xc0
 [<ffffffff81578dbe>] neigh_timer_handler+0x9e/0x2a0
 [<ffffffff81578d20>] ? neigh_update+0x640/0x640
 [<ffffffff81073558>] __do_softirq+0xc8/0x3a0

Since jump_label_{inc|dec} must be called from process context only,
we must defer jump_label_dec() if net_disable_timestamp() is called
from interrupt context.
Reported-by: NIgor Maravic <igorm@etf.rs>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b90e5794

tcp: tcp_sendmsg() wrong access to sk_route_caps · 690e99c4

由 Eric Dumazet 提交于 11月 28, 2011

Now sk_route_caps is u64, its dangerous to use an integer to store
result of an AND operator. It wont work if NETIF_F_SG is moved on the
upper part of u64.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

690e99c4

28 11月, 2011 5 次提交

tcp: skip cwnd moderation in TCP_CA_Open in tcp_try_to_open · 8cd6d616