Commits · ca739481662137b8f717bc21f16719cda3c33d6b · openeuler / raspberrypi-kernel

02 Jun, 2010 1 commit

TCP: tcp_hybla: Fix integer overflow in slow start increment · edafe502

Committed by Daniele Lacamera on Jun 02, 2010

For large values of rtt, 2^rho operation may overflow u32. Clamp down the increment to 2^16.
Signed-off-by: Daniele Lacamera <root@danielinux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

edafe502
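
To illustrate the clamp this commit describes, here is a minimal C sketch that bounds the exponent before shifting. The helper name `hybla_pow2_clamped` is hypothetical and the code is not copied from `tcp_hybla.c`; it only assumes that `rho` is an unsigned exponent and that 16 is the ceiling named in the commit message.

```c
#include <stdint.h>

/* Hypothetical helper: return 2^rho, with the exponent clamped to 16
 * so that the shift can never overflow a 32-bit unsigned value. */
static uint32_t hybla_pow2_clamped(uint32_t rho)
{
    uint32_t shift = rho < 16u ? rho : 16u;

    return (uint32_t)1 << shift;
}
```

With this shape, a very large `rho` (caused by a large RTT) yields at most 2^16 instead of an undefined or wrapped shift result.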

01 Jun, 2010 2 commits

net/ipv4/tcp_input.c: fix compilation breakage when FASTRETRANS_DEBUG > 1 · 288fcee8

Committed by Joe Perches on May 31, 2010

Commit: c720c7e8 missed these.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

288fcee8

net: sock_queue_err_skb() dont mess with sk_forward_alloc · b1faf566

Committed by Eric Dumazet on May 31, 2010

Correct sk_forward_alloc handling for error_queue would need to use a
backlog of frames that softirq handler could not deliver because socket
is owned by user thread. Or extend backlog processing to be able to
process normal and error packets.

Another possibility is to not use mem charge for error queue, this is
what I implemented in this patch.

Note: this reverts commit 29030374
(net: fix sk_forward_alloc corruptions), since we dont need to lock
socket anymore.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1faf566

31 May, 2010 1 commit

netfilter: xtables: stackptr should be percpu · 7489aec8

Committed by Eric Dumazet on May 31, 2010

commit f3c5c1bf (netfilter: xtables: make ip_tables reentrant)
introduced a performance regression, because stackptr array is shared by
all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache
line)

Fix this using alloc_percpu()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-By: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>

7489aec8

29 May, 2010 1 commit

net: fix sk_forward_alloc corruptions · 29030374

Committed by Eric Dumazet on May 29, 2010

As David found out, sock_queue_err_skb() should be called with socket
lock held, or we risk sk_forward_alloc corruption, since we use non
atomic operations to update this field.

This patch adds bh_lock_sock()/bh_unlock_sock() pair to three spots.
(BH already disabled)

1) skb_tstamp_tx() 
2) Before calling ip_icmp_error(), in __udp4_lib_err() 
3) Before calling ipv6_icmp_error(), in __udp6_lib_err()
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29030374

27 May, 2010 1 commit

net: fix lock_sock_bh/unlock_sock_bh · 8a74ad60

Committed by Eric Dumazet on May 26, 2010

This new sock lock primitive was introduced to speedup some user context
socket manipulation. But it is unsafe to protect two threads, one using
regular lock_sock/release_sock, one using lock_sock_bh/unlock_sock_bh

This patch changes lock_sock_bh to be careful against 'owned' state.
If owned is found to be set, we must take the slow path.
lock_sock_bh() now returns a boolean to say if the slow path was taken,
and this boolean is used at unlock_sock_bh time to call the appropriate
unlock function.

After this change, BH are either disabled or enabled during the
lock_sock_bh/unlock_sock_bh protected section. This might be misleading,
so we rename these functions to lock_sock_fast()/unlock_sock_fast().
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

8a74ad60

26 May, 2010 1 commit

ipmr: off by one in __ipmr_fill_mroute() · ed0f160a

Committed by Dan Carpenter on May 26, 2010

This fixes a smatch warning:
	net/ipv4/ipmr.c +1917 __ipmr_fill_mroute(12) error: buffer overflow
	'(mrt)->vif_table' 32 <= 32

The ipv6 version had the same issue.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed0f160a
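
The smatch warning above (`32 <= 32`) is the classic signature of an index check that uses `>` where `>=` is needed. The following C sketch shows the pattern in isolation; `MAX_VIFS` and `vif_lookup` are hypothetical names chosen for illustration, not the actual code in `net/ipv4/ipmr.c`.

```c
#define MAX_VIFS 32 /* assumed table size, matching the 32 in the warning */

/* Hypothetical lookup: valid indices are 0 .. MAX_VIFS - 1.
 * Writing the check as "idx > MAX_VIFS" would let idx == 32 through
 * and read one element past the end of the table. */
static int vif_lookup(const int table[MAX_VIFS], unsigned int idx, int *out)
{
    if (idx >= MAX_VIFS)
        return -1;

    *out = table[idx];
    return 0;
}
```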

25 May, 2010 1 commit

kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN · 4be929be

Committed by Alexey Dobriyan on May 24, 2010

- C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not
  USHORT_MAX/SHORT_MAX/SHORT_MIN.

- Make SHRT_MIN of type s16, not int, for consistency.

[akpm@linux-foundation.org: fix drivers/dma/timb_dma.c]
[akpm@linux-foundation.org: fix security/keys/keyring.c]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4be929be

18 May, 2010 8 commits

net: Remove unnecessary returns from void function()s · 3fa21e07

Committed by Joe Perches on May 17, 2010

This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3fa21e07

net: Introduce skb_tunnel_rx() helper · d19d56dd

Committed by Eric Dumazet on May 17, 2010

skb rxhash should be cleared when a skb is handled by a tunnel before
being delivered again, so that correct packet steering can take place.

There are other cleanups and accounting that we can factorize in a new
helper, skb_tunnel_rx()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d19d56dd

tcp: tcp_synack_options() fix · de213e5e

Committed by Eric Dumazet on May 17, 2010

Commit 33ad798c (tcp: options clean up) introduced a problem
if MD5+SACK+timestamps were used in initial SYN message.

Some stacks (old linux for example) try to negotiate MD5+SACK+TSTAMP
sessions, but since 40 bytes of tcp options space are not enough to
store all the bits needed, we chose to disable timestamps in this case.

We send a SYN-ACK _without_ timestamp option, but socket has timestamps
enabled and all further outgoing messages contain a TS block, all with
the initial timestamp of the remote peer.

Fix is to really disable timestamps option for the whole session.
Reported-by: Bijay Singh <Bijay.Singh@guavus.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de213e5e

net: Remove unnecessary semicolons after switch statements · ccbd6a5a

Committed by Joe Perches on May 14, 2010

Also added an explicit break; to avoid
a fallthrough in net/ipv4/tcp_input.c
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ccbd6a5a

net: No dst refcounting in ip_queue_xmit() · ab6e3feb

Committed by Eric Dumazet on May 10, 2010

TCP outgoing packets can avoid two atomic ops, and dirtying
of previously highly contended cache line using new refdst
infrastructure.

Note 1: loopback device excluded because of !IFF_XMIT_DST_RELEASE
Note 2: UDP packets dsts are built before ip_queue_xmit().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ab6e3feb

net: Use ip_route_input_noref() in input path · 4a94445c

Committed by Eric Dumazet on May 10, 2010

Use ip_route_input_noref() in ip fast path, to avoid two atomic ops per
incoming packet.

Note: loopback is excluded from this optimization in ip_rcv_finish()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a94445c

net: implements ip_route_input_noref() · 407eadd9

Committed by Eric Dumazet on May 10, 2010

ip_route_input() is the version returning a refcounted dst, while
ip_route_input_noref() returns a non refcounted one.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

407eadd9

net: add a noref bit on skb dst · 7fee226a

Committed by Eric Dumazet on May 11, 2010

Use low order bit of skb->_skb_dst to tell dst is not refcounted.

Change _skb_dst to _skb_refdst to make sure all uses are caught.

skb_dst() returns the dst, regardless of noref bit set or not, but
with a lockdep check to make sure a noref dst is not given if current
user is not rcu protected.

New skb_dst_set_noref() helper to set a non-refcounted dst on a skb.
(with lockdep check)

skb_dst_drop() drops a reference only if skb dst was refcounted.

skb_dst_force() helper is used to force a refcount on dst, when skb
is queued and not anymore RCU protected.

Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if
!IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in
sock_queue_rcv_skb(), in __nf_queue().

Use skb_dst_force() in dev_requeue_skb().

Note: dst_use_noref() still dirties dst, we might transform it
later to do one dirtying per jiffies.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7fee226a
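
The "noref bit" idea in this commit is a form of pointer tagging: the low-order bit of an aligned pointer is always zero, so it can carry a flag. Below is a minimal user-space sketch of that technique. All names (`struct dst`, `struct buf`, `buf_dst_*`, `REFDST_*`) are hypothetical stand-ins for illustration; the lockdep and RCU checks mentioned in the commit are deliberately omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REFDST_NOREF   ((uintptr_t)1)      /* flag stored in the low bit */
#define REFDST_PTRMASK (~REFDST_NOREF)     /* mask that recovers the pointer */

struct dst { int refcnt; };                /* stand-in for a route entry */
struct buf { uintptr_t refdst; };          /* stand-in for the skb field */

/* Attach a dst without taking a reference; mark it with the low bit. */
static void buf_dst_set_noref(struct buf *b, struct dst *d)
{
    b->refdst = (uintptr_t)d | REFDST_NOREF;
}

/* Return the dst regardless of whether the noref bit is set. */
static struct dst *buf_dst(const struct buf *b)
{
    return (struct dst *)(b->refdst & REFDST_PTRMASK);
}

static bool buf_dst_is_noref(const struct buf *b)
{
    return (b->refdst & REFDST_NOREF) != 0;
}

/* Upgrade a noref dst to a counted reference, e.g. before queueing. */
static void buf_dst_force(struct buf *b)
{
    if (buf_dst_is_noref(b)) {
        buf_dst(b)->refcnt++;
        b->refdst &= REFDST_PTRMASK;
    }
}

/* Drop a reference only if the dst was actually refcounted. */
static void buf_dst_drop(struct buf *b)
{
    if (b->refdst && !buf_dst_is_noref(b))
        buf_dst(b)->refcnt--;
    b->refdst = 0;
}
```

The sketch relies on `struct dst` being at least 2-byte aligned, which holds here because it contains an `int`.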

16 May, 2010 3 commits

net: Introduce sk_route_nocaps · a465419b

Committed by Eric Dumazet on May 16, 2010

TCP-MD5 sessions have intermittent failures, when route cache is
invalidated. ip_queue_xmit() has to find a new route, calls
sk_setup_caps(sk, &rt->u.dst), destroying the 

sk->sk_route_caps &= ~NETIF_F_GSO_MASK

that MD5 desperately try to make all over its way (from
tcp_transmit_skb() for example)

So we send few bad packets, and everything is fine when
tcp_transmit_skb() is called again for this socket.

Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a
socket field, sk_route_nocaps, containing bits to mask on sk_route_caps.
Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a465419b
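
The mechanism described above can be sketched as a persistent "never enable" mask that is reapplied every time the capabilities are recomputed. The struct, flag values, and function names below are hypothetical and simplified for illustration; they are not the kernel's definitions.

```c
#include <stdint.h>

#define FEAT_SG  0x1u  /* hypothetical feature bits */
#define FEAT_GSO 0x2u

struct sock_caps {
    uint32_t route_caps;    /* capabilities currently usable by the socket */
    uint32_t route_nocaps;  /* bits that must stay masked off */
};

/* Called whenever a new route is picked: start from the device features,
 * then reapply the persistent mask so earlier restrictions survive. */
static void setup_caps(struct sock_caps *sk, uint32_t dev_features)
{
    sk->route_caps = dev_features & ~sk->route_nocaps;
}

/* Record a restriction and apply it immediately. */
static void nocaps_add(struct sock_caps *sk, uint32_t flags)
{
    sk->route_nocaps |= flags;
    sk->route_caps &= ~flags;
}
```

Because the mask lives in the socket, a route change that triggers `setup_caps()` again no longer re-enables the bits that were cleared earlier.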

tcp: fix MD5 (RFC2385) support · 35790c04

Committed by Eric Dumazet on May 16, 2010

TCP MD5 support uses percpu data for temporary storage. It currently
disables preemption so that same storage cannot be reclaimed by another
thread on same cpu.

We also have to make sure a softirq handler wont try to use also same
context. Various bug reports demonstrated corruptions.

Fix is to disable preemption and BH.
Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

35790c04

net: reserve ports for applications using fixed port numbers · e3826f1e

Committed by Amerigo Wang on May 05, 2010

(Dropped the infiniband part, because Tetsuo modified the related code,
I will send a separate patch for it once this is accepted.)

This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
allows users to reserve ports for third-party applications.

The reserved ports will not be used by automatic port assignments
(e.g. when calling connect() or bind() with port number 0). Explicit
port allocation behavior is unchanged.
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3826f1e
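
The behaviour described above (automatic assignment skips reserved ports, explicit binds are unaffected) can be modelled with a bitmap lookup. This is a simplified user-space sketch with hypothetical names; in particular it scans the range linearly, which is an assumption made for brevity and not a description of how the kernel chooses a port.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per port number; a set bit means "reserved". */
static uint8_t reserved_bitmap[65536 / 8];

static void reserve_port(uint16_t port)
{
    reserved_bitmap[port / 8] |= (uint8_t)(1u << (port % 8));
}

static bool port_is_reserved(uint16_t port)
{
    return (reserved_bitmap[port / 8] >> (port % 8)) & 1u;
}

/* Automatic assignment: return the first non-reserved port in
 * [low, high], or -1 if every port in the range is reserved. */
static int pick_automatic_port(uint16_t low, uint16_t high)
{
    for (uint32_t p = low; p <= high; p++) {
        if (!port_is_reserved((uint16_t)p))
            return (int)p;
    }
    return -1;
}
```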

13 May, 2010 3 commits

netfilter: remove unnecessary returns from void function()s · 736d58e3

Committed by Joe Perches on May 13, 2010

This patch removes from net/ netfilter files
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'
Signed-off-by: Joe Perches <joe@perches.com>
[Patrick: changed to keep return statements in otherwise empty function bodies]
Signed-off-by: Patrick McHardy <kaber@trash.net>

736d58e3

netfilter: cleanup printk messages · 654d0fbd

Committed by Stephen Hemminger on May 13, 2010

Make sure all printk messages have a severity level.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

654d0fbd

netfilter: change NF_ASSERT to WARN_ON · af567603

Committed by Stephen Hemminger on May 13, 2010

Change netfilter asserts to standard WARN_ON. This has the
benefit of backtrace info and also causes netfilter errors
to show up on kerneloops.org.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

af567603

12 May, 2010 5 commits

netfilter: xtables: combine built-in extension structs · 4538506b

Committed by Jan Engelhardt on Jul 04, 2009

Prepare the arrays for use with the multiregister function. The
future layer-3 xt matches can then be easily added to it without
needing more (un)register code.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

4538506b

netfilter: xtables: change hotdrop pointer to direct modification · b4ba2611

Committed by Jan Engelhardt on Jul 07, 2009

Since xt_action_param is writable, let's use it. The pointer to
'bool hotdrop' always worried (8 bytes (64-bit) to write 1 byte!).
Surprisingly results in a reduction in size:

   text    data     bss filename
5457066  692730  357892 vmlinux.o-prev
5456554  692730  357892 vmlinux.o
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

b4ba2611

netfilter: xtables: deconstify struct xt_action_param for matches · 62fc8051

Committed by Jan Engelhardt on Jul 07, 2009

In future, layer-3 matches will be an xt module of their own, and
need to set the fragoff and thoff fields. Adding more pointers would
needlessly increase memory requirements (esp. so for 64-bit, where
pointers are wider).
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

62fc8051

netfilter: xtables: substitute temporary defines by final name · 4b560b44

Committed by Jan Engelhardt on Jul 05, 2009

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

4b560b44

netfilter: xtables: combine struct xt_match_param and xt_target_param · de74c169

Committed by Jan Engelhardt on Jul 05, 2009

The structures carried - besides match/target - almost the same data.
It is possible to combine them, as extensions are evaluated serially,
and so, the callers end up a little smaller.

  text  data  bss  filename
-15318   740  104  net/ipv4/netfilter/ip_tables.o
+15286   740  104  net/ipv4/netfilter/ip_tables.o
-15333   540  152  net/ipv6/netfilter/ip6_tables.o
+15269   540  152  net/ipv6/netfilter/ip6_tables.o
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

de74c169

10 May, 2010 2 commits

net: Fix FDDI and TR config checks in ipv4 arp and LLC. · f0ecde14

Committed by David S. Miller on May 10, 2010

Need to check both CONFIG_FOO and CONFIG_FOO_MODULE
Signed-off-by: David S. Miller <davem@davemloft.net>

f0ecde14
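
For context: when a tristate option is built as a module, the build system defines the `_MODULE` variant of the symbol rather than the plain one, so a test on `CONFIG_FOO` alone misses modular builds. The sketch below simulates that case in plain C; `HAVE_FDDI` and `fddi_supported` are hypothetical names, and the `#define` at the top only stands in for what the kernel build would generate.

```c
/* Simulate a modular build: only the _MODULE symbol is defined. */
#define CONFIG_FDDI_MODULE 1

/* Test both spellings so built-in and modular configurations match. */
#if defined(CONFIG_FDDI) || defined(CONFIG_FDDI_MODULE)
#define HAVE_FDDI 1
#else
#define HAVE_FDDI 0
#endif

static int fddi_supported(void)
{
    return HAVE_FDDI;
}
```

Had the condition been `#ifdef CONFIG_FDDI` only, `fddi_supported()` would return 0 in this simulated modular build.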

IPv4: unresolved multicast route cleanup · bbd72543

Committed by Andreas Meissner on May 10, 2010

Fixes the expiration timer for unresolved multicast route entries.
In case new multicast routing requests come in faster than the
expiration timeout occurs (e.g. zap through multicast TV streams), the
timer is prevented from being called at time for already existing entries.

As the single timer is reset to default whenever a new entry is made,
the timeout for existing unresolved entries are missed and/or not
updated. As a consequence new requests are denied when the limit of
unresolved entries has been reached because old entries live longer than
they are supposed to.

The solution is to reset the timer only for the first unresolved entry
in the multicast routing cache. All other timers are already set and
updated correctly within the timer function itself by now.

Signed-off-by: Andreas Meissner <andreas.meissner@sphairon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bbd72543
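
The fix described above amounts to arming the shared timer only on the empty-to-non-empty transition of the unresolved queue. A minimal model of that rule follows; the struct and function names are hypothetical, and a counter stands in for the real timer call.

```c
/* Hypothetical model of the unresolved-entry queue. */
struct unresolved_queue {
    int len;         /* number of unresolved entries */
    int timer_arms;  /* how many times the expiry timer was (re)armed */
};

/* Add an entry; arm the timer only when this is the first entry.
 * Re-arming on every insert would keep pushing the expiry of older
 * entries into the future, which is the bug the commit describes. */
static void queue_unresolved(struct unresolved_queue *q)
{
    q->len++;
    if (q->len == 1)
        q->timer_arms++;  /* stand-in for resetting the timer */
}
```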

08 May, 2010 1 commit

ipv4: remove ip_rt_secret timer (v4) · 3ee94372

Committed by Neil Horman on May 08, 2010

A while back there was a discussion regarding the rt_secret_interval timer.
Given that we've had the ability to do emergency route cache rebuilds for awhile
now, based on a statistical analysis of the various hash chain lengths in the
cache, the use of the flush timer is somewhat redundant. This patch removes the
rt_secret_interval sysctl, allowing us to rely solely on the statistical
analysis mechanism to determine the need for route cache flushes.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3ee94372

07 May, 2010 1 commit

ipv4: udp: fix short packet and bad checksum logging · ccc2d97c

Committed by Bjørn Mork on May 06, 2010

commit 2783ef23 moved the initialisation of saddr and daddr after
pskb_may_pull() to avoid a potential data corruption.  Unfortunately
also placing it after the short packet and bad checksum error paths,
where these variables are used for logging.  The result is bogus
output like

[92238.389505] UDP: short packet: From 2.0.0.0:65535 23715/178 to 0.0.0.0:65535

Moving the saddr and daddr initialisation above the error paths, while still
keeping it after the pskb_may_pull() to keep the fix from commit 2783ef23.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Cc: stable@kernel.org
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ccc2d97c

02 May, 2010 2 commits

netfilter: xtables: dissolve do_match function · ef53d702

Committed by Jan Engelhardt on Jul 09, 2009

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

ef53d702

netfilter: ip_tables: fix compilation when debug is enabled · b5cad0df

Committed by Jan Engelhardt on May 02, 2010

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>

b5cad0df

29 Apr, 2010 3 commits

net: ip_queue_rcv_skb() helper · f84af32c

Committed by Eric Dumazet on Apr 28, 2010

When queueing a skb to socket, we can immediately release its dst if
target socket do not use IP_CMSG_PKTINFO.

tcp_data_queue() can drop dst too.

This to benefit from a hot cache line and avoid the receiver, possibly
on another cpu, to dirty this cache line himself.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f84af32c

net: speedup udp receive path · 4b0b72f7

Committed by Eric Dumazet on Apr 28, 2010

Since commit 95766fff ([UDP]: Add memory accounting.), 
each received packet needs one extra sock_lock()/sock_release() pair.

This added latency because of possible backlog handling. Then later,
ticket spinlocks added yet another latency source in case of DDOS.

This patch introduces lock_sock_bh() and unlock_sock_bh()
synchronization primitives, avoiding one atomic operation and backlog
processing.

skb_free_datagram_locked() uses them instead of full blown
lock_sock()/release_sock(). skb is orphaned inside locked section for
proper socket memory reclaim, and finally freed outside of it.

UDP receive path now take the socket spinlock only once.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b0b72f7

Revert "tcp: bind() fix when many ports are bound" · 8d238b25

Committed by David S. Miller on Apr 28, 2010

This reverts two commits:

fda48a0d
tcp: bind() fix when many ports are bound

and a follow-on fix for it:

6443bb1f
ipv6: Fix inet6_csk_bind_conflict()

It causes problems with binding listening sockets when time-wait
sockets from a previous instance still are alive.

It's too late to keep fiddling with this so late in the -rc
series, and we'll deal with it in net-next-2.6 instead.
Signed-off-by: David S. Miller <davem@davemloft.net>

8d238b25

28 Apr, 2010 3 commits

net: sk_add_backlog() take rmem_alloc into account · c377411f

Committed by Eric Dumazet on Apr 27, 2010

Current socket backlog limit is not enough to really stop DDOS attacks,
because the user thread spends a lot of time processing a full backlog each
round, and the user might spin heavily on the socket lock.

We should add backlog size and receive_queue size (aka rmem_alloc) to
pace writers, and let user run without being slow down too much.

Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in
stress situations.

Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp
receiver can now process ~200.000 pps (instead of ~100 pps before the
patch) on a 8 core machine.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c377411f
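
The admission check this commit describes (count both the backlog and the receive queue against the buffer limit, without taking the socket lock) can be sketched as a pure function. The struct and names below are hypothetical simplifications; the real helper reads kernel socket fields, some of them atomically.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the receive-side accounting of a socket. */
struct rx_state {
    size_t backlog_len;  /* bytes queued on the backlog */
    size_t rmem_alloc;   /* bytes already on the receive queue */
    size_t rcvbuf;       /* configured receive buffer limit */
};

/* Return true if accepting a packet of the given size would exceed
 * the limit once backlog and receive queue are counted together. */
static bool rcvqueues_full(const struct rx_state *s, size_t pkt_truesize)
{
    return s->backlog_len + s->rmem_alloc + pkt_truesize > s->rcvbuf;
}
```

A producer can call such a check first and drop the packet early, so that under overload it never contends for the socket lock at all.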

net: Make RFS socket operations not be inet specific. · c58dc01b

Committed by David S. Miller on Apr 27, 2010

Idea from Eric Dumazet.

As for placement inside of struct sock, I tried to choose a place
that otherwise has a 32-bit hole on 64-bit systems.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

c58dc01b

TCP: avoid to send keepalive probes if receiving data · 6c37e5de

Committed by Flavio Leitner on Apr 26, 2010

RFC 1122 says the following:
...
  Keep-alive packets MUST only be sent when no data or
  acknowledgement packets have been received for the
  connection within an interval.
...

The acknowledgement packet is resetting the keepalive
timer but the data packet isn't. This patch fixes it by
checking the timestamp of the last received data packet
too when the keepalive timer expires.
Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c37e5de
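
The rule quoted from RFC 1122 means the idle time must be measured from the most recent of the two events, data received or ACK received. A minimal sketch of that computation follows; the function and parameter names are hypothetical, and timestamps are plain unsigned tick counters so that subtraction stays correct across wraparound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Idle time is the smaller of "ticks since last data packet" and
 * "ticks since last ACK", i.e. time since the most recent of the two. */
static uint32_t keepalive_idle(uint32_t now, uint32_t last_data_rx,
                               uint32_t last_ack_rx)
{
    uint32_t since_data = now - last_data_rx;
    uint32_t since_ack = now - last_ack_rx;

    return since_data < since_ack ? since_data : since_ack;
}

/* Send a probe only if nothing at all was received for the interval. */
static bool should_send_probe(uint32_t now, uint32_t last_data_rx,
                              uint32_t last_ack_rx, uint32_t keepalive_time)
{
    return keepalive_idle(now, last_data_rx, last_ack_rx) >= keepalive_time;
}
```

Considering only the ACK timestamp would correspond to the behaviour before the fix, where a connection that is steadily receiving data could still be probed.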

26 Apr, 2010 1 commit

net: ipmr: add support for dumping routing tables over netlink · cb6a4e46

Committed by Patrick McHardy on Apr 26, 2010

The ipmr /proc interface (ip_mr_cache) can't be extended to dump routes
from any tables but the main table in a backwards compatible fashion since
the output format ends in a variable amount of output interfaces.

Introduce a new netlink interface to dump multicast routes from all tables,
similar to the netlink interface for regular routes.
Signed-off-by: Patrick McHardy <kaber@trash.net>

cb6a4e46