提交 · 0c54b85f2828128274f319a1eb3ce7f604fe2a53 · openeuler / raspberrypi-kernel

16 3月, 2009 5 次提交

tcp: simplify tcp_current_mss · 0c54b85f

由 Ilpo Järvinen 提交于 3月 14, 2009

There's very little need for most of the callsites to get
tp->xmit_goal_size updated. That will cost us divide as is,
so slice the function in two. Also, the only users of the
tp->xmit_goal_size are directly behind tcp_current_mss(),
so there's no need to store that variable into tcp_sock
at all! The drop of xmit_goal_size currently leaves 16-bit
hole and some reorganization would again be necessary to
change that (but I'm aiming to fill that hole with u16
xmit_goal_size_segs to cache the results of the remaining
divide to get that tso on regression).

Bring xmit_goal_size parts into tcp.c
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0c54b85f

tcp: don't check mtu probe completion in the loop · 72211e90

由 Ilpo Järvinen 提交于 3月 14, 2009

It seems that no variables clash such that we couldn't do
the check just once later on. Therefore move it.

Also kill dead obvious comment, dead argument and add
unlikely since this mtu probe does not happen too often.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

72211e90

tcp: consolidate paws check · c887e6d2

由 Ilpo Järvinen 提交于 3月 14, 2009

Wow, it was quite tricky to merge that stream of negations
but I think I finally got it right:

check & replace_ts_recent:
(s32)(rcv_tsval - ts_recent) >= 0                  => 0
(s32)(ts_recent - rcv_tsval) <= 0                  => 0

discard:
(s32)(ts_recent - rcv_tsval)  > TCP_PAWS_WINDOW    => 1
(s32)(ts_recent - rcv_tsval) <= TCP_PAWS_WINDOW    => 0

I toggled the return values of tcp_paws_check around since
the old encoding added yet-another negation making tracking
of truth-values really complicated.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c887e6d2

tcp: kill dead end_seq variable in clean_rtx_queue · c43d558a

由 Ilpo Järvinen 提交于 3月 14, 2009

I've already forgotten what for this was necessary, anyway
it's no longer used (if it ever was).
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c43d558a

tcp: remove pointless .dsack/.num_sacks code · 5861f8e5

由 Ilpo Järvinen 提交于 3月 14, 2009

In the pure assignment case, the earlier zeroing is
still in effect.

David S. Miller raised concerns if the ifs are there to avoid
dirtying cachelines. I came to these conclusions:

> We'll be dirty it anyway (now that I check), the first "real" statement
> in tcp_rcv_established is:
>
>       tp->rx_opt.saw_tstamp = 0;
>
> ...that'll land on the same dword. :-/
>
> I suppose the blocks are there just because they had more complexity
> inside when they had to calculate the eff_sacks too (maybe it would
> have been better to just remove them in that drop-patch so you would
> have had less head-ache :-)).
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5861f8e5

14 3月, 2009 3 次提交

tcp: '< 0' test on unsigned · a2025b8b

由 Roel Kluin 提交于 3月 13, 2009

promote 'cnt' to size_t, to match 'len'.
Signed-off-by: NRoel Kluin <roel.kluin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2025b8b

ipv4: arp announce, arp_proxy and windows ip conflict verification · 73ce7b01

由 Denys Fedoryshchenko 提交于 3月 13, 2009

Windows (XP at least) hosts on boot, with configured static ip, performing 
address conflict detection, which is defined in RFC3927.
Here is quote of important information:

"
An ARP announcement is identical to the ARP Probe described above, 
except    that now the sender and target IP addresses are both set 
to the host's newly selected IPv4 address. 
"

But it same time this goes wrong with RFC5227.
"
The 'sender IP address' field MUST be set to all zeroes; this is to avoid
polluting ARP caches in other hosts on the same link in the case
where the address turns out to be already in use by another host.
"

When ARP proxy configured, it must not answer to both cases, because 
it is address conflict verification in any case. For Windows it is just 
causing to detect false "ip conflict". Already there is code for RFC5227, so 
just trivially we just check also if source ip == target ip.
Signed-off-by: NDenys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73ce7b01

Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying... · ead2ceb0

由 Neil Horman 提交于 3月 11, 2009

Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying end-of-line points for skbs
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>

 include/linux/skbuff.h |    4 +++-
 net/core/datagram.c    |    2 +-
 net/core/skbuff.c      |   22 ++++++++++++++++++++++
 net/ipv4/arp.c         |    2 +-
 net/ipv4/udp.c         |    2 +-
 net/packet/af_packet.c |    2 +-
 6 files changed, 29 insertions(+), 5 deletions(-)
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ead2ceb0

12 3月, 2009 1 次提交

tcp: allow timestamps even if SYN packet has tsval=0 · fc1ad92d

由 Eric Dumazet 提交于 3月 11, 2009

Some systems send SYN packets with apparently wrong RFC1323 timestamp
option values [timestamp tsval=0 tsecr=0].
It might be for security reasons (http://www.secuobs.com/plugs/25220.shtml )

Linux TCP stack ignores this option and sends back a SYN+ACK packet
without timestamp option, thus many TCP flows cannot use timestamps
and lose some benefit of RFC1323.

Other operating systems seem to not care about initial tsval value, and let
tcp flows to negotiate timestamp option.
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fc1ad92d

10 3月, 2009 1 次提交

net: convert usage of packet_type to read_mostly · 7546dd97

由 Stephen Hemminger 提交于 3月 09, 2009

Protocols that use packet_type can be __read_mostly section for better
locality. Elminate any unnecessary initializations of NULL.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7546dd97

03 3月, 2009 3 次提交

tcp: Like icmp use register_pernet_subsys · 2f20d2e6

由 Eric W. Biederman 提交于 2月 22, 2009

To remove the possibility of packets flying around when network
devices are being cleaned up use reisger_pernet_subsys instead of
register_pernet_device.
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: NDenis V. Lunev <den@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2f20d2e6

netns: Fix icmp shutdown. · 6eb07772

由 Eric W. Biederman 提交于 2月 22, 2009

Recently I had a kernel panic in icmp_send during a network namespace
cleanup.  There were packets in the arp queue that failed to be sent
and we attempted to generate an ICMP host unreachable message, but
failed because icmp_sk_exit had already been called.

The network devices are removed from a network namespace and their
arp queues are flushed before we do attempt to shutdown subsystems
so this error should have been impossible.

It turns out icmp_init is using register_pernet_device instead
of register_pernet_subsys.  Which resulted in icmp being shut down
while we still had the possibility of packets in flight, making
a nasty NULL pointer deference in interrupt context possible.

Changing this to register_pernet_subsys fixes the problem in
my testing.
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: NDenis V. Lunev <den@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6eb07772

tcp: tcp_init_wl / tcp_update_wl argument cleanup · ee7537b6

由 Hantzis Fotis 提交于 3月 02, 2009

The above functions from include/net/tcp.h have been defined with an
argument that they never use. The argument is 'u32 ack' which is never
used inside the function body, and thus it can be removed. The rest of
the patch involves the necessary changes to the function callers of the
above two functions.
Signed-off-by: NHantzis Fotis <xantzis@ceid.upatras.gr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ee7537b6

02 3月, 2009 14 次提交

tcp: get rid of two unnecessary u16s in TCP skb flags copying · 9ce01461

由 Ilpo Järvinen 提交于 2月 28, 2009

I guess these fields were one day 16-bit in the struct but
nowadays they're just using 8 bits anyway.

This is just a precaution, didn't result any change in my
case but who knows what all those varying gcc versions &
options do. I've been told that 16-bit is not so nice with
some cpus.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9ce01461

tcp: in sendmsg/pages open code the real goto target · 0d6a775e

由 Ilpo Järvinen 提交于 2月 28, 2009

copied was assigned zero right before the goto, so if (copied)
cannot ever be true.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d6a775e

tcp: kill eff_sacks "cache", the sole user can calculate itself · cabeccbd

由 Ilpo Järvinen 提交于 2月 28, 2009

Also fixes insignificant bug that would cause sending of stale
SACK block (would occur in some corner cases).
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cabeccbd

tcp: add helper for AI algorithm · 758ce5c8

由 Ilpo Järvinen 提交于 2月 28, 2009

It seems that implementation in yeah was inconsistent to what
other did as it would increase cwnd one ack earlier than the
others do.

Size benefits:

  bictcp_cong_avoid |  -36
  tcp_cong_avoid_ai |  +52
  bictcp_cong_avoid |  -34
  tcp_scalable_cong_avoid |  -36
  tcp_veno_cong_avoid |  -12
  tcp_yeah_cong_avoid |  -38

= -104 bytes total
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

758ce5c8

htcp: merge icsk_ca_state compare · 571a5dd8

由 Ilpo Järvinen 提交于 2月 28, 2009

Similar to what is done elsewhere in TCP code when double
state checks are being done.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

571a5dd8

tcp: drop unnecessary local var in collapse · e6c7d085

由 Ilpo Järvinen 提交于 2月 28, 2009

Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e6c7d085

tcp: cleanup ca_state mess in tcp_timer · bc079e9e

由 Ilpo Järvinen 提交于 2月 28, 2009

Redundant checks made indentation impossible to follow.
However, it might be useful to make this ca_state+is_sack
indexed array.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bc079e9e

tcp: separate timeout marking loop to it's own function · 7363a5b2

由 Ilpo Järvinen 提交于 2月 28, 2009

Some comment about its current state added. So far I have
seen very few cases where the thing is actually useful,
usually just marginally (though admittedly I don't usually
see top of window losses where it seems possible that there
could be some gain), instead, more often the cases suffer
from L-marking spike which is certainly not desirable
(I'll bury improving it to my todo list, but on a low
prio position).
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7363a5b2

tcp: remove redundant code from tcp_mark_lost_retrans · d0af4160

由 Ilpo Järvinen 提交于 2月 28, 2009

Arnd Hannemann <hannemann@nets.rwth-aachen.de> noticed and was
puzzled by the fact that !tcp_is_fack(tp) leads to early return
near the beginning and the later on tcp_is_fack(tp) was still
used in an if condition. The later check was a left-over from
RFC3517 SACK stuff (== !tcp_is_fack(tp) behavior nowadays) as
there wasn't clear way how to handle this particular check
cheaply in the spirit of RFC3517 (using only SACK blocks, not
holes + SACK blocks as with FACK). I sort of left it there as
a reminder but since it's confusing other people just remove
it and comment the missing-feature stuff instead.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d0af4160

tcp: fix corner case issue in segmentation during rexmitting · 02276f3c

由 Ilpo Järvinen 提交于 2月 28, 2009

If cur_mss grew very recently so that the previously G/TSOed skb
now fits well into a single segment it would get send up in
parts unless we calculate # of segments again. This corner-case
could happen eg. after mtu probe completes or less than
previously sack blocks are required for the opposite direction.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

02276f3c

tcp: Don't clear hints when tcp_fragmenting · d3d2ae45

由 Ilpo Järvinen 提交于 2月 28, 2009

1) We didn't remove any skbs, so no need to handle stale refs.

2) scoreboard_skb_hint is trivial, no timestamps were changed
   so no need to clear that one

3) lost_skb_hint needs tweaking similar to that of
   tcp_sacktag_one().
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d3d2ae45

tcp: deferring in middle of queue makes very little sense · 62ad2761

由 Ilpo Järvinen 提交于 2月 28, 2009

If skb can be sent right away, we certainly should do that
if it's in the middle of the queue because it won't get
more data into it.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62ad2761

tcp: fix lost_cnt_hint miscounts · 59a08cba

由 Ilpo Järvinen 提交于 2月 28, 2009

It is possible that lost_cnt_hint gets underflow in
tcp_clean_rtx_queue because the cumulative ACK can cover
the segment where lost_skb_hint points to only partially,
which means that the hint is not cleared, opposite to what
my (earlier) comment claimed.

Also I don't agree what I ended up writing about non-trivial
case there to be what I intented to say. It was not supposed
to happen that the hint won't get cleared and we underflow
in any scenario.

In general, this is quite hard to trigger in practice.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

59a08cba

tcp: don't backtrack to sacked skbs · ac11ba75

由 Ilpo Järvinen 提交于 2月 28, 2009

Backtracking to sacked skbs is a horrible performance killer
since the hint cannot be advanced successfully past them...
...And it's totally unnecessary too.

In theory this is 2.6.27..28 regression but I doubt anybody
can make .28 to have worse performance because of other TCP
improvements.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ac11ba75

01 3月, 2009 1 次提交

tcp: fix retrans_out leaks · 9ec06ff5

由 Ilpo Järvinen 提交于 3月 01, 2009

There's conflicting assumptions in shifting, the caller assumes
that dupsack results in S'ed skbs (or a part of it) for sure but
never gave a hint to tcp_sacktag_one when dsack is actually in
use. Thus DSACK retrans_out -= pcount was not taken and the
counter became out of sync. Remove obstacle from that information
flow to get DSACKs accounted in tcp_sacktag_one as expected.
Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: NDenys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9ec06ff5

27 2月, 2009 1 次提交

inet fragments: fix sparse warning: context imbalance · 56bca31f

由 Hannes Eder 提交于 2月 25, 2009

Impact: Attribute function with __releases(...)

Fix this sparse warning:
  net/ipv4/inet_fragment.c:276:35: warning: context imbalance in 'inet_frag_find' - unexpected unlock
Signed-off-by: NHannes Eder <hannes@hanneseder.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

56bca31f

25 2月, 2009 4 次提交

ipip: used time_before for comparing jiffies · 26d94b46

由 Wei Yongjun 提交于 2月 24, 2009

The functions time_before is more robust for comparing
jiffies against other values.
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

26d94b46

gre: used time_before for comparing jiffies · da6185d8

由 Wei Yongjun 提交于 2月 24, 2009

The functions time_before is more robust for comparing
jiffies against other values.
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da6185d8

netlink: change nlmsg_notify() return value logic · 1ce85fe4

由 Pablo Neira Ayuso 提交于 2月 24, 2009

This patch changes the return value of nlmsg_notify() as follows:

If NETLINK_BROADCAST_ERROR is set by any of the listeners and
an error in the delivery happened, return the broadcast error;
else if there are no listeners apart from the socket that
requested a change with the echo flag, return the result of the
unicast notification. Thus, with this patch, the unicast
notification is handled in the same way of a broadcast listener
that has set the NETLINK_BROADCAST_ERROR socket flag.

This patch is useful in case that the caller of nlmsg_notify()
wants to know the result of the delivery of a netlink notification
(including the broadcast delivery) and take any action in case
that the delivery failed. For example, ctnetlink can drop packets
if the event delivery failed to provide reliable logging and
state-synchronization at the cost of dropping packets.

This patch also modifies the rtnetlink code to ignore the return
value of rtnl_notify() in all callers. The function rtnl_notify()
(before this patch) returned the error of the unicast notification
which makes rtnl_set_sk_err() reports errors to all listeners. This
is not of any help since the origin of the change (the socket that
requested the echoing) notices the ENOBUFS error if the notification
fails and should resync itself.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Acked-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ce85fe4

tcp_scalable: Update malformed & dead url · a52b8bd3

由 Joe Perches 提交于 2月 24, 2009

Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a52b8bd3

24 2月, 2009 1 次提交

Doc: Refer to ip-sysctl.txt for strict vs. loose rp_filter mode · d18921a0

由 Jesper Dangaard Brouer 提交于 2月 23, 2009

The IP_ADVANCED_ROUTER Kconfig describes the rp_filter
proc option.  Recent changes added a loose mode.
Instead of documenting this change too places, refer to
the document describing it:
 Documentation/networking/ip-sysctl.txt

I'm considering moving the rp_filter description away
from the Kconfig file into ip-sysctl.txt.
Signed-off-by: NJesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d18921a0

23 2月, 2009 6 次提交

tcp: Like icmp use register_pernet_subsys · 6a1b3054

由 Eric W. Biederman 提交于 2月 22, 2009

To remove the possibility of packets flying around when network
devices are being cleaned up use reisger_pernet_subsys instead of
register_pernet_device.
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: NDenis V. Lunev <den@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6a1b3054

netns: Fix icmp shutdown. · 959d2726

由 Eric W. Biederman 提交于 2月 22, 2009

Recently I had a kernel panic in icmp_send during a network namespace
cleanup.  There were packets in the arp queue that failed to be sent
and we attempted to generate an ICMP host unreachable message, but
failed because icmp_sk_exit had already been called.

The network devices are removed from a network namespace and their
arp queues are flushed before we do attempt to shutdown subsystems
so this error should have been impossible.

It turns out icmp_init is using register_pernet_device instead
of register_pernet_subsys.  Which resulted in icmp being shut down
while we still had the possibility of packets in flight, making
a nasty NULL pointer deference in interrupt context possible.

Changing this to register_pernet_subsys fixes the problem in
my testing.
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: NDenis V. Lunev <den@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

959d2726

ipv4: Clean whitespaces in net/ipv4/Kconfig. · a6e8f27f

由 Jesper Dangaard Brouer 提交于 2月 22, 2009

While going through net/ipv4/Kconfig cleanup whitespaces.
Signed-off-by: NJesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a6e8f27f

ipv4: Fix rp_filter description in net/ipv4/Kconfig. · b2cc46a8

由 Jesper Dangaard Brouer 提交于 2月 22, 2009

The reverse path filter (rp_filter) will NOT get enabled
when enabling forwarding.  Read the code and tested in
in practice.

Most distributions do enable it in startup scripts.
Signed-off-by: NJesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b2cc46a8

ip: ipip compile warning · 5747a1aa

由 Stephen Hemminger 提交于 2月 22, 2009

Get rid of compile warning about non-const format
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5747a1aa

ip: add loose reverse path filtering · c1cf8422

由 Stephen Hemminger 提交于 2月 20, 2009

Extend existing reverse path filter option to allow strict or loose
filtering. (See http://en.wikipedia.org/wiki/Reverse_path_filtering).

For compatibility with existing usage, the value 1 is chosen for strict mode
and 2 for loose mode.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c1cf8422