Commits · ceb1eec8291175686d0208e66595ff83bc0624e2 · openeuler / Kernel

11 Oct, 2007 · 40 commits

[IPSEC]: Move IP length/checksum setting out of transforms · ceb1eec8

Committed by Herbert Xu on Oct 10, 2007

This patch moves the setting of the IP length and checksum fields out of
the transforms and into the xfrmX_output functions.  This would help future
efforts in merging the transforms themselves.

It also adds an optimisation to ipcomp due to the fact that the transport
offset is guaranteed to be zero.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

ceb1eec8
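
Note on ceb1eec8: the commit moves the final "set IP length, then recompute the header checksum" step out of the individual transforms. The user-space sketch below only illustrates that step in isolation. The struct `ipv4_hdr_sketch` and both function names are hypothetical and are not the kernel's `struct iphdr` or its checksum helpers; the checksum follows the generic RFC 1071 one's-complement sum.

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>  /* htons, ntohs */

/* Hypothetical 20-byte IPv4 header without options (not the kernel's struct iphdr). */
struct ipv4_hdr_sketch {
    uint8_t  ver_ihl;
    uint8_t  tos;
    uint16_t tot_len;   /* network byte order */
    uint16_t id;
    uint16_t frag_off;
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t check;     /* network byte order */
    uint32_t saddr;
    uint32_t daddr;
};

/* RFC 1071 one's-complement checksum over an even-length header. */
static uint16_t ip_header_checksum(const void *hdr, size_t len)
{
    const uint8_t *p = hdr;
    uint32_t sum = 0;

    /* Add the header as big-endian 16-bit words. */
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return htons((uint16_t)~sum);
}

/* Done once after all transforms have run: set the length, then the checksum. */
static void finish_ipv4_header(struct ipv4_hdr_sketch *iph, size_t packet_len)
{
    iph->tot_len = htons((uint16_t)packet_len);
    iph->check = 0;  /* the checksum field must be zero while summing */
    iph->check = ip_header_checksum(iph, sizeof(*iph));
}
```

Recomputing `ip_header_checksum()` over a finished header returns 0, which is the usual way a receiver validates it.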

[IPSEC]: Get rid of ipv6_{auth,esp,comp}_hdr · 87bdc48d

Committed by Herbert Xu on Oct 10, 2007

This patch removes the duplicate ipv6_{auth,esp,comp}_hdr structures since
they're identical to the IPv4 versions.  Duplicating them would only create
problems for ourselves later when we need to add things like extended
sequence numbers.

I've also added transport header type conversion headers for these types
which are now used by the transforms.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

87bdc48d

[IPSEC]: Use IPv6 calling convention as the convention for x->mode->output · 37fedd3a

Committed by Herbert Xu on Oct 10, 2007

The IPv6 calling convention for x->mode->output is more general and could
help an eventual protocol-generic x->type->output implementation.  This
patch adopts it for IPv4 as well and modifies the IPv4 type output functions
accordingly.

It also rewrites the IPv6 mac/transport header calculation to be based off
the network header where practical.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

37fedd3a

[IPSEC]: Set skb->data to payload in x->mode->output · 7b277b1a

Committed by Herbert Xu on Oct 10, 2007

This patch changes the calling convention so that on entry from
x->mode->output and before entry into x->type->output skb->data
will point to the payload instead of the IP header.

This is essentially a redistribution of skb_push/skb_pull calls
with the aim of minimising them on the common path of tunnel +
ESP.

It'll also let us use the same calling convention between IPv4
and IPv6 with the next patch.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

7b277b1a

[IPSEC] esp: Remove NAT-T checksum invalidation for BEET · 8bd17075

Committed by Herbert Xu on Oct 10, 2007

I pointed this out back when this patch was first proposed but it looks like
it got lost along the way.

The checksum only needs to be ignored for NAT-T in transport mode where
we lose the original inner addresses due to NAT.  With BEET the inner
addresses will be intact so the checksum remains valid.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

8bd17075

[TCP]: Separate lost_retrans loop into own function · 1c1e87ed

Committed by Ilpo Järvinen on Oct 10, 2007

This follows the own-function-for-each-task principle; this is really
a somewhat separate task being done in sacktag. It also reduces
indentation.

In addition, added an ack_seq local var to break some long
lines & fixed coding style things.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c1e87ed

[NETFILTER]: Make netfilter code use the seq_open_private · e2da5913

Committed by Pavel Emelyanov on Oct 10, 2007

Just switch to the consolidated calls.

ipt_recent() has to initialize the private, so use
the __seq_open_private() helper.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e2da5913

[NET]: Make core networking code use seq_open_private · cf7732e4

Committed by Pavel Emelyanov on Oct 10, 2007

This concerns the ipv4 and ipv6 code mostly, but also the netlink
and unix sockets.

The netlink code is an example of how to use the __seq_open_private()
call - it saves the net namespace on this private.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

cf7732e4

[IPSEC]: Move state lock into x->type->output · b7c6538c

Committed by Herbert Xu on Oct 09, 2007

This patch releases the lock on the state before calling x->type->output.
It also adds the lock to the spots where they're currently needed.

Most of those places (all except mip6) are expected to disappear with
async crypto.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

b7c6538c

[IPSEC]: Store IPv6 nh pointer in mac_header on output · 007f0211

Committed by Herbert Xu on Oct 09, 2007

Currently the x->mode->output functions store the IPv6 nh pointer in the
skb network header. This is inconvenient because the network header then
has to be fixed up before the packet can leave the IPsec stack. The mac
header field is unused on output so we can use that to store this instead.

This patch does that and removes the network header fix-up in xfrm_output.

It also uses ipv6_hdr where appropriate in the x->type->output functions.

There is also a minor clean-up in esp4 to make it use the same code as
esp6 to help any subsequent effort to merge the two.

Lastly it kills two redundant skb_set_* statements in BEET that were
simply copied over from transport mode.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

007f0211

[IPSEC]: Move output replay code into xfrm_output · 436a0a40

Committed by Herbert Xu on Oct 08, 2007

The replay counter is one of only two remaining things in the output code
that requires a lock on the xfrm state (the other being the crypto).  This
patch moves it into the generic xfrm_output so we can remove the lock from
the transforms themselves.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

436a0a40

[IPSEC]: Move common output code to xfrm_output · 406ef77c

Committed by Herbert Xu on Oct 08, 2007

Most of the code in xfrm4_output_one and xfrm6_output_one is identical, so
this patch moves it into a common xfrm_output function which will live
in net/xfrm.

In fact this would seem to fix a bug as on IPv4 we never reset the network
header after a transform which may upset netfilter later on.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

406ef77c

[IPSEC] ah: Remove keys from ah_data structure · bc31d3b2

Committed by Herbert Xu on Oct 08, 2007

The keys are only used during initialisation so we don't need to carry them
in ah_data. Since we don't have to allocate them again, there is no need
to place a limit on the authentication key length anymore.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

bc31d3b2

[IPSEC] esp: Remove keys from esp_data structure · 4b7137ff

Committed by Herbert Xu on Oct 08, 2007

The keys are only used during initialisation so we don't need to carry them
in esp_data.  Since we don't have to allocate them again, there is no need
to place a limit on the authentication key length anymore.

This patch also kills the unused auth.icv member.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b7137ff

[NET]: sparse warning fixes · cfcabdcc

Committed by Stephen Hemminger on Oct 09, 2007

Fix a bunch of sparse warnings. Mostly about 0 used as
NULL pointer, and shadowed variable declarations.
One notable case was that hash size should have been unsigned.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

cfcabdcc

[TCP]: "Annotate" another fackets_out state reset · de83c058

Committed by Ilpo Järvinen on Oct 07, 2007

This should no longer be necessary because fackets_out is
accurate. It indicates bugs elsewhere, thus report it.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

de83c058

[TCP]: Fix two off-by-one errors in fackets_out adjusting logic · 29d0a309

Committed by Ilpo Järvinen on Oct 07, 2007

1) Passing wrong skb to tcp_adjust_fackets_out could corrupt
fastpath_cnt_hint as tcp_skb_pcount(next_skb) is not included
to it if hint points exactly to the next_skb (it's lagging
behind, see sacktag).

2) When fastpath_skb_hint is put backwards to avoid dangling
skb reference, the skb's pcount must also be removed from count
(not included like above).

Reported by Cedric Le Goater <legoater@free.fr>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

29d0a309

[TCP]: Wrap-safed reordering detection FRTO check · 3de96471

Committed by Ilpo Järvinen on Oct 01, 2007

In case somebody has a suggestion about a better place for this
check, which must guarantee execution "early enough" (i.e.,
before the wrap can occur), I'm very open to it.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

3de96471
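
Note on 3de96471: the message does not show the check itself, so the following is only a generic illustration of why 32-bit TCP sequence comparisons need to be wrap-safe. The helper names `seq_before` and `seq_after` are chosen for this sketch; they mirror the idea behind the kernel's `before()`/`after()` helpers but are not copied from the patch.

```c
#include <stdint.h>

/*
 * Wrap-safe comparison of 32-bit sequence numbers: take the unsigned
 * difference and interpret it as signed. This gives the right answer as
 * long as the two values are less than 2^31 apart, which is why such a
 * check has to run "early enough", before the distance grows too large.
 */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

static int seq_after(uint32_t a, uint32_t b)
{
    return seq_before(b, a);
}
```

A plain `a < b` would report that 0xfffffff0 comes after 0x10, while `seq_before(0xfffffff0, 0x10)` correctly treats 0x10 as the later value after wrap-around.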

[TCP]: Update comment of SACK block validator · 0e835331

Committed by Ilpo Järvinen on Oct 01, 2007

Just came across what RFC2018 states about generation of valid
SACK blocks in case of reneging. Alter comment a bit to point
out clearly.

IMHO, there isn't any reason to change code because the
validation is there for a purpose (counters will inform user
about decision TCP made if this case ever surfaces).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

0e835331

[TCP]: fix comments that got messed up during code move · 95eacd27

Committed by Ilpo Järvinen on Oct 01, 2007

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

95eacd27

[TCP]: No fackets_out/highest_sack tuning when SACK isn't enabled · dc86967b

Committed by Ilpo Järvinen on Oct 01, 2007

This was found due to a bug report from Cedric Le Goater, though
it turned out to be an unrelated bug.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc86967b

[NETFILTER]: ctnetlink: use netlink policy · f73e924c

Committed by Patrick McHardy on Sep 28, 2007

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

f73e924c

[NETFILTER]: nfnetlink: rename functions containing 'nfattr' · fdf70832

Committed by Patrick McHardy on Sep 28, 2007

There is no struct nfattr anymore, rename functions to 'nlattr'.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

fdf70832

[NETFILTER]: nfnetlink: convert to generic netlink attribute functions · df6fb868

Committed by Patrick McHardy on Sep 28, 2007

Get rid of the duplicated rtnetlink macros and use the generic netlink
attribute functions. The old duplicated stuff is moved to a new header
file that exists just for userspace.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

df6fb868

[NET]: Move hardware header operations out of netdevice. · 3b04ddde

Committed by Stephen Hemminger on Oct 09, 2007

Since hardware header operations are part of the protocol class
not the device instance, make them into a separate object and
save memory.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3b04ddde

[NET]: Wrap hard_header_parse · b95cce35

Committed by Stephen Hemminger on Sep 26, 2007

Wrap the hard_header_parse function to simplify next step of
header_ops conversion.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b95cce35

[NET]: Wrap netdevice hardware header creation. · 0c4e8581

Committed by Stephen Hemminger on Oct 09, 2007

Add inline for common usage of hardware header creation, and
fix a bug in IPV6 mcast where a negative return was assumed to be
an errno. A negative return from hard_header means not enough space
was available (i.e. -N bytes).
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c4e8581

[NET]: Make the loopback device per network namespace. · 2774c7ab

Committed by Eric W. Biederman on Sep 26, 2007

This patch makes loopback_dev per network namespace.  Adding
code to create a different loopback device for each network
namespace and adding the code to free a loopback device
when a network namespace exits.

This patch modifies all users of loopback_dev so they
access it as init_net.loopback_dev, keeping all of the
code compiling and working.  A later pass will be needed to
update the users to use something other than the initial network
namespace.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2774c7ab

[IPV4]: When possible test for IFF_LOOPBACK and not dev == loopback_dev · 0cc217e1

Committed by Eric W. Biederman on Sep 26, 2007

Now that multiple loopback devices are becoming possible it makes
the code a little cleaner and more maintainable to test if a device
is a loopback device by testing dev->flags & IFF_LOOPBACK instead
of dev == loopback_dev.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0cc217e1

[IPV4]: Remove unnecessary test for the loopback device from inetdev_destroy · 5967789d

Committed by Eric W. Biederman on Sep 26, 2007

Currently we never call unregister_netdev for the loopback device so
it is impossible for us to reach inetdev_destroy with the loopback
device.  So the test in inetdev_destroy is unnecessary.

Further, when testing with my network namespace patches,
unregistering the loopback device and calling inetdev_destroy works
fine, so there appears to be no reason to avoid unregistering the
loopback device.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5967789d

[TCP] MIB: Count FRTO's successfully detected spurious RTOs · 912d8f0b

Committed by Ilpo Järvinen on Sep 25, 2007

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

912d8f0b

[TCP]: Reordered ACK's (old) SACKs not included to discarded MIB · 93e68020

Committed by Ilpo Järvinen on Sep 25, 2007

In case of ACK reordering, the SACK block might be valid in its
time but is already obsoleted since we've received another kind
of confirmation about arrival of the segments through snd_una
advancement of an earlier packet.

I didn't bother to build distinguishing of valid and invalid
SACK blocks but simply made reordered SACK blocks that are too
old always not counted regardless of their "real" validity which
could be determined by using the ack field of the reordered
packet (won't be significant IMHO).

DSACKs can very well be considered useful even in this situation,
so won't do any of this for them.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

93e68020

[TCP]: Re-place highest_sack check to a more robust position · a6963a6b

Committed by Ilpo Järvinen on Sep 25, 2007

I previously added checking to position that is rather poor as
state has already been adjusted quite a bit. Re-placing it above
all state changes should be more robust though the return should
never ever get executed regardless of its place :-).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6963a6b

[NET]: Dynamically allocate the loopback device, part 1. · de3cb747

Committed by Daniel Lezcano on Sep 25, 2007

This patch replaces all occurrences of the static variable
loopback_dev with a pointer loopback_dev. That provides the
mindless, trivial, uninteresting part of the change for the dynamic
allocation of the loopback.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-By: Kirill Korotaev <dev@sw.ru>
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

de3cb747

[TCP]: Avoid clearing sacktag hint in trivial situations · b7689205

Committed by Ilpo Järvinen on Sep 20, 2007

There's no reason to clear the sacktag skb hint when small part
of the rexmit queue changes. Account changes (if any) instead when
fragmenting/collapsing. RTO/FRTO do not touch SACKED_ACKED bits so
no need to discard SACK tag hint at all.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

b7689205

[TCP]: Enable SACK enhanced FRTO (RFC4138) by default · c96fd3d4

Committed by Ilpo Järvinen on Sep 20, 2007

Most of the description that follows comes from my mail to
netdev (some editing done):

The main obstacle to FRTO use is its deployment, as it has to be on
the sender side whereas the wireless link is often the receiver's
access link. Take initiative on behalf of unlucky receivers and
enable it by default in future Linux TCP senders. Also, the IETF
seems to be interested in advancing FRTO from experimental [1].

How does FRTO help?
===================

FRTO detects spurious RTOs and avoids a number of unnecessary
retransmissions and a couple of other problems that can arise
due to incorrect guess made at RTO (i.e., that segments were
lost when they actually got delayed which is likely to occur
e.g. in wireless environments with link-layer retransmission).
Though FRTO cannot prevent the first (potentially unnecessary)
retransmission at RTO, I suspect that it won't cost that much
even if you have to pay for each bit (won't be that high
percentage out of all packets after all :-)). However, usually
when you have a spurious RTO, not only the first segment is
unnecessarily retransmitted but the *whole window*. It goes like
this: all cumulative ACKs got delayed due to in-order delivery,
then TCP will actually send 1.5*original cwnd worth of data in
the RTO's slow-start when the delayed ACKs arrive (basically the
original cwnd worth of it unnecessarily). In case one is
interested in minimizing unnecessary retransmissions e.g. due to
cost, those rexmissions must never see daylight. Besides, in the
worst case the generated burst overloads the bottleneck buffers
which is likely to significantly delay the further progress of
the flow. In case of ll rexmissions, ACK compression often
occurs at the same time making the burst very "sharp edged" (in
that case TCP often loses most of the segments above high_seq
=> very bad performance too). When FRTO is enabled, those
unnecessary retransmissions are fully avoided except for the
first segment and the cwnd behavior after detected spurious RTO
is determined by the response (one can tune that by sysctl).

With the basic version (the non-SACK-enhanced one), FRTO can fail to
detect a spurious RTO as spurious and fall back to conservative
behavior. ACK lossage is much less significant than reordering;
usually the FRTO can detect spurious RTO if at least 2
cumulative ACKs from original window are preserved (excluding
the ACK that advances to high_seq). With SACK-enhanced version,
the detection is quite robust.

FRTO should remove the need to set a high lower bound for the
RTO estimator due to delay spikes that occur relatively common
in some environments (esp. in wireless/cellular ones).

[1] http://www1.ietf.org/mail-archive/web/tcpm/current/msg02862.html
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

c96fd3d4

[TCP] FRTO: Improve interoperability with other undo_marker users · 009a2e3e

Committed by Ilpo Järvinen on Sep 20, 2007

Basically this change enables it, previously other undo_marker
users were left with nothing. Reverse undo_marker logic
completely to get it set right in CA_Loss. On the other hand,
when spurious RTO is detected, clear it. Clearing might be too
heavy for some scenarios but seems safe enough starting point
for now and shouldn't have much effect except in majority of
cases (if in any).

By adding a new FLAG_ we avoid looping through write_queue when
RTO occurs.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

009a2e3e

[TCP]: Cleanup tcp_tso_acked and tcp_clean_rtx_queue · 7c46a03e

Committed by Ilpo Järvinen on Sep 20, 2007

Implements following cleanups:
- Comment re-placement (CodingStyle)
- tcp_tso_acked() local (wrapper-like) variable removal
  (readability)
- __-types removed (IMHO they make local variables jumpy looking
  and just waste space)
- acked -> flag (naming conventions elsewhere in TCP code)
- linebreak adjustments (readability)
- nested if()s combined (reduced indentation)
- clarifying newlines added
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c46a03e

[TCP]: Move accounting from tso_acked to clean_rtx_queue · 13fcf850

Committed by Ilpo Järvinen on Oct 09, 2007

The accounting code is pretty much the same, so it's a shame
we do it in two places.

I'm not too sure if the added fully_acked check in MTU probing is
really what we want; perhaps the added end_seq could be used in
the after() comparison.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

13fcf850

[TCP]: clear_all_retrans_hints prefixed by tcp_ · 5af4ec23

Committed by Ilpo Järvinen on Sep 20, 2007

In addition, fix its function comment spacing.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

5af4ec23
