Commits · 723884339f90a9c420783135168cc1045750eb5d · openeuler / raspberrypi-kernel

05 Sep, 2009 6 commits

sctp: Sysctl configuration for IPv4 Address Scoping · 72388433

Committed by Bhaskar Dutta on Sep 03, 2009

This patch introduces a new sysctl option to make IPv4 Address Scoping
configurable <draft-stewart-tsvwg-sctp-ipv4-00.txt>.

In networking environments where DNAT rules in iptables prerouting
chains convert destination IPs to link-local/private IP addresses,
SCTP connections fail to establish because the kernel drops the INIT
chunk on an address scope mismatch.
For example, to support overlapping IP addresses (the same IP address
with different VLAN IDs), a Layer-5 application listens on link-local
IPs, and a DNAT rule maps the destination IP to a link-local IP. Such
applications never get the SCTP INIT if the address-scoping draft is
strictly followed.

This sysctl configuration allows SCTP to function in such
unconventional networking environments.

Sysctl options:
0 - Disable IPv4 address scoping draft altogether
1 - Enable IPv4 address scoping (default, current behavior)
2 - Enable address scoping but allow IPv4 private addresses in init/init-ack
3 - Enable address scoping but allow IPv4 link local address in init/init-ack
Signed-off-by: Bhaskar Dutta <bhaskar.dutta@globallogic.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
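
The four sysctl values listed above can be read as a small policy table. The following is a minimal Python sketch of that reading, not the kernel's code: the function name `ipv4_addr_allowed`, the constant names, and the simplifying assumption that the peer is reached over a global address are all hypothetical.

```python
import ipaddress

# Sysctl values as listed in the commit message (constant names are made up).
SCOPE_DISABLE = 0     # ignore the scoping draft altogether
SCOPE_ENABLE = 1      # strict scoping (default)
SCOPE_PRIVATE = 2     # scoping on, but private addresses allowed
SCOPE_LINK_LOCAL = 3  # scoping on, but link-local addresses allowed


def ipv4_addr_allowed(addr: str, policy: int) -> bool:
    """Hypothetical check: may `addr` be listed in an INIT/INIT-ACK sent to
    a peer that is reached over a global address?"""
    ip = ipaddress.IPv4Address(addr)
    if policy == SCOPE_DISABLE:
        return True
    if ip.is_loopback:
        return False
    # Test link-local first: Python also reports 169.254.0.0/16 as private.
    if ip.is_link_local:
        return policy == SCOPE_LINK_LOCAL
    if ip.is_private:
        return policy == SCOPE_PRIVATE
    return True
```

With the default policy, a DNAT'ed link-local destination such as 169.254.1.1 is rejected, which matches the failure described in the commit message; policy 3 (or 0) lets it through.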

sctp: Turn flags in 'sctp_packet' into bit fields · a803c942

Committed by Vlad Yasevich on Sep 04, 2009

This shrinks the size of sctp_packet a little.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: Fix SCTP_MAXSEG socket option to comply to spec. · f68b2e05

Committed by Vlad Yasevich on Sep 04, 2009

We had a bug where we never stored the user-defined value for
MAXSEG when setting the value on an association.  Thus future
PMTU events ended up rewriting the frag point and increasing
it past the user limit.  Additionally, when setting the option on
the socket/endpoint, we affected all current associations, which
is against the spec.

Now, we store the user 'maxseg' value along with the computed
'frag_point'.  We inherit 'maxseg' from the socket at association
creation and use it as an upper limit for 'frag_point' when it's
set.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
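
The relationship between the stored 'maxseg' and the computed 'frag_point' can be sketched as a simple clamp. This is a minimal Python illustration, assuming IPv4 without options; the function signature and the convention that 0 means "not set" are assumptions for the example, not the kernel's interface.

```python
IPV4_HEADER = 20          # bytes, IPv4 header without options
SCTP_COMMON_HEADER = 12   # bytes
DATA_CHUNK_HEADER = 16    # bytes


def frag_point(pmtu: int, user_maxseg: int = 0) -> int:
    """Largest user payload carried in one DATA chunk.

    user_maxseg == 0 is taken to mean 'the user never set a limit'.
    """
    frag = pmtu - IPV4_HEADER - SCTP_COMMON_HEADER - DATA_CHUNK_HEADER
    if user_maxseg:
        # The stored user value is an upper bound; a PMTU increase
        # can no longer push the frag point past it.
        frag = min(frag, user_maxseg)
    return frag
```

Because the user value is kept separately, recomputing after a PMTU event (for example from 1500 to 9000) still yields the user's limit rather than a larger PMTU-derived value.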

sctp: Don't do NAGLE delay on large writes that were fragmented small · cb95ea32

Committed by Vlad Yasevich on Sep 04, 2009

SCTP will delay the last part of a large write due to Nagle if that
part is smaller than the MTU. Since we are doing large writes, we might
as well send the last portion now instead of waiting until the next
large write happens. The small portion will be sent as is regardless,
so it's better not to delay it.

This is the result of much discussion with Wei Yongjun <yjwei@cn.fujitsu.com>
and Doug Graham <dgraham@nortel.com>. Many thanks go out to them.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
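
One way to picture the decision described above is as a predicate that delays a small chunk only when it does not come from a large, fragmented write. The sketch below is illustrative Python under that assumption; the function name and all parameters are hypothetical and do not mirror the kernel's data structures.

```python
def nagle_delays(chunk_len: int, queued_len: int, msg_size: int,
                 max_payload: int, outstanding_bytes: int,
                 nodelay: bool = False) -> bool:
    """Return True if sending this chunk should be delayed (sketch only).

    chunk_len         -- size of the chunk about to be sent
    queued_len        -- other data already waiting in the queue
    msg_size          -- size of the user message the chunk belongs to
    max_payload       -- payload that fits in one packet
    outstanding_bytes -- data in flight and not yet acknowledged
    """
    if nodelay or outstanding_bytes == 0:
        return False  # Nagle disabled, or nothing in flight to wait for
    fills_packet = chunk_len + queued_len >= max_payload
    from_large_write = msg_size >= max_payload
    # Delay only small data that is not the tail of a large write.
    return not fills_packet and not from_large_write
```

For a 3000-byte write split at 1452 bytes, the 96-byte tail is sent immediately, while a standalone 96-byte message with data in flight is still held back.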

sctp: drop a_rwnd to 0 when receive buffer overflows. · 4d3c46e6

Committed by Vlad Yasevich on Sep 04, 2009

SCTP has a problem: when small chunks are used, it is possible
to exhaust the receive buffer without fully closing the receive window.
This happens because of all the overhead that we have to account for
with small messages. To fix this, when the receive buffer is exceeded,
we drop the window to 0 and save the 'drop' portion. When the application
starts reading data and freeing up receive buffer space, we wait until
we've reached the 'drop' window and then add back this 'drop' one
MTU at a time. This worked well in testing and, under stress, produced
rather even recovery.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
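
The drop-and-restore behaviour can be modelled with a few counters. The following Python sketch is a simplified model under stated assumptions: the class name, the fixed per-chunk `overhead`, and the exact restore condition are illustrative choices, not the kernel's implementation.

```python
class ReceiveWindow:
    """Toy model of a_rwnd with a saved 'drop' portion (sketch only)."""

    def __init__(self, rcvbuf: int, mtu: int, overhead: int):
        self.rcvbuf = rcvbuf      # receive buffer limit, in bytes
        self.mtu = mtu
        self.overhead = overhead  # assumed bookkeeping cost per chunk
        self.buffered = 0         # buffer bytes in use, overhead included
        self.rwnd = rcvbuf        # advertised window
        self.dropped = 0          # window taken away when the buffer filled

    def receive(self, payload: int) -> None:
        self.buffered += payload + self.overhead
        self.rwnd = max(self.rwnd - payload, 0)
        if self.buffered >= self.rcvbuf and self.rwnd > 0:
            # Buffer exhausted before the window closed: close it now
            # and remember how much window was dropped.
            self.dropped += self.rwnd
            self.rwnd = 0

    def read(self, payload: int) -> None:
        self.buffered -= payload + self.overhead
        self.rwnd += payload
        if self.dropped and self.rwnd >= self.dropped:
            # Window has recovered to the dropped amount: hand the
            # dropped portion back one MTU at a time.
            change = min(self.mtu, self.dropped)
            self.rwnd += change
            self.dropped -= change
```

With a 1000-byte buffer and 50-byte chunks that each cost another 50 bytes of overhead, ten chunks fill the buffer while the window still shows 500 bytes; the model then advertises 0 and restores the saved 500 bytes gradually as the application reads.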

sctp: Send user messages to the lower layer as one · 9c5c62be

Committed by Vlad Yasevich on Aug 10, 2009

Currently, SCTP breaks up user messages into fragments and
sends each fragment to the lower layer by itself.  This means
that for each fragment we go all the way down the stack
and back up.  This also discourages bundling of multiple
fragments when they can fit into a single packet (e.g. due
to the user setting a low fragmentation threshold).

We introduce a new command, SCTP_CMD_SND_MSG, and hand the
whole message down to the state machine.  The state machine and
the side-effect parser will cork the queue, add all chunks
from the message to the queue, and then uncork the queue,
thus causing the chunks to get transmitted.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
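
The cork, enqueue, uncork sequence can be illustrated with a tiny queue model. This is a Python sketch only: `OutQueue`, `send_message`, and the `transmit` callback are hypothetical names, and packet size limits are deliberately ignored.

```python
class OutQueue:
    """Toy outbound queue that can be corked (sketch only)."""

    def __init__(self, transmit):
        self.transmit = transmit  # callable that receives a list of chunks
        self.corked = False
        self.pending = []

    def cork(self) -> None:
        self.corked = True

    def add(self, chunk) -> None:
        self.pending.append(chunk)
        if not self.corked:
            self.flush()  # uncorked: every chunk goes out on its own

    def uncork(self) -> None:
        self.corked = False
        self.flush()

    def flush(self) -> None:
        if self.pending:
            self.transmit(self.pending)
            self.pending = []


def send_message(queue: OutQueue, fragments) -> None:
    """Hand a whole message down: cork, queue all fragments, uncork."""
    queue.cork()
    for fragment in fragments:
        queue.add(fragment)
    queue.uncork()
```

Adding three fragments to an uncorked queue triggers three transmissions, whereas `send_message` produces a single transmission carrying all three, which is the bundling opportunity the commit describes.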

03 Jun, 2009 1 commit

sctp: fix to choose alternate destination when retransmit ASCONF chunk · 9919b455

Committed by Wei Yongjun on May 12, 2009

RFC 5061, Section 5.1, "ASCONF Chunk Procedures", says:

B4)  Re-transmit the ASCONF Chunk last sent and if possible choose an
     alternate destination address (please refer to [RFC4960],
     Section 6.4.1).  An endpoint MUST NOT add new parameters to this
     chunk; it MUST be the same (including its Sequence Number) as
     the last ASCONF sent.  An endpoint MAY, however, bundle an
     additional ASCONF with new ASCONF parameters with the next
     Sequence Number.  For details, see Section 5.5.

This patch chooses an alternate destination address when
retransmitting the ASCONF chunk, and cleans up some duplicated code.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

16 Feb, 2009 2 commits

sctp: Fix the RTO-doubling on idle-link heartbeats · faee47cd

Committed by Vlad Yasevich on Feb 13, 2009

SCTP incorrectly doubles the RTO every time a HEARTBEAT chunk
is generated.  However, RFC 4960 states:

   On an idle destination address that is allowed to heartbeat, it is
   recommended that a HEARTBEAT chunk is sent once per RTO of that
   destination address plus the protocol parameter 'HB.interval', with
   jittering of +/- 50% of the RTO value, and exponential backoff of the
   RTO if the previous HEARTBEAT is unanswered.

Essentially, we double the RTO only if the heartbeat is unacknowledged.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
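
The corrected rule fits in a few lines. The Python sketch below only illustrates the backoff condition quoted from RFC 4960; the function name and the explicit `rto_max` cap are assumptions for the example, and jitter and HB.interval are left out.

```python
def next_rto(rto: int, rto_max: int, last_hb_acked: bool) -> int:
    """RTO (in ms) to use when the heartbeat timer fires on an idle path."""
    if last_hb_acked:
        return rto  # previous HEARTBEAT was answered: no backoff
    return min(rto * 2, rto_max)  # unanswered: exponential backoff, capped
```

Under this rule an idle but healthy path keeps its RTO, while a path that stops answering heartbeats backs off up to the configured maximum.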

sctp: Allow to disable SCTP checksums via module parameter · 06e86806

Committed by Lucas Nussbaum on Feb 13, 2009

This is a new version of my patch, now using a module parameter instead
of a sysctl, so that the option is harder to find. Please note that,
once the module is loaded, it is still possible to change the value of
the parameter in /sys/module/sctp/parameters/, which is useful if you
want to do performance comparisons without rebooting.

Computation of SCTP checksums significantly affects the performance of
SCTP. For example, using two dual-Opteron 246 machines connected over a
GbE network, it was not possible to achieve more than ~730 Mbps, compared to
941 Mbps after disabling SCTP checksums.
Unfortunately, SCTP checksum offloading in NICs is not commonly
available (yet).

By default, checksums are still enabled, of course.
Signed-off-by: Lucas Nussbaum <lucas.nussbaum@ens-lyon.fr>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Oct, 2008 1 commit

sctp: Rework the tsn map to use generic bitmap. · 8e1ee18c

Committed by Vlad Yasevich on Oct 08, 2008

The TSN map currently in use is 4K in size and is stuck inside
the sctp_association structure, making memory references REALLY
expensive.  What we really need is at most 4K worth of bits,
so the biggest map we would have is 512 bytes.  Also, the
map is only really useful when we have gaps to store and
report.  As such, starting with a minimal map of, say, 32 TSNs (bits)
should be enough for normal low-loss operations.  We can grow
the map by some multiple of 32, along with some extra room, any
time we receive a TSN that would put us outside of the map
boundary.  As we close gaps, we can shift the map to rebase
it on the latest TSN we've seen.  This saves 4088 bytes per
association in the map alone, along with savings from the now
unnecessary structure members.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
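
The grow-and-rebase idea can be sketched with a Python integer standing in for the bitmap. This is an illustrative model, not the kernel's data structure: the class and method names are hypothetical, growth rounds up to the next multiple of 32 without the "extra room" the commit mentions, and 32-bit TSN wraparound is ignored.

```python
class TsnMap:
    """Growable bitmap of received TSNs, rebased as gaps close (sketch)."""

    INITIAL_BITS = 32
    MAX_BITS = 4096  # "at most 4K worth of bits", i.e. 512 bytes

    def __init__(self, initial_tsn: int):
        self.base = initial_tsn  # TSN represented by bit 0
        self.size = self.INITIAL_BITS
        self.bits = 0            # Python int used as the bitmap

    def mark(self, tsn: int) -> bool:
        """Record a received TSN; return False if it cannot be tracked."""
        offset = tsn - self.base
        if offset < 0:
            return True   # already covered by the cumulative ack point
        if offset >= self.MAX_BITS:
            return False  # too far ahead of the base to track
        if offset >= self.size:
            # Grow to the next multiple of 32 that covers this TSN.
            self.size = min(self.MAX_BITS, (offset // 32 + 1) * 32)
        self.bits |= 1 << offset
        # Gaps closed at the front: shift the map to rebase it.
        while self.bits & 1:
            self.bits >>= 1
            self.base += 1
        return True

    def has_gap(self) -> bool:
        return self.bits != 0
```

In the loss-free case every `mark` immediately shifts the map, so it never grows beyond its initial 32 bits; it only expands when a TSN arrives far enough ahead of a gap.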

01 Oct, 2008 2 commits

sctp: try harder to figure out address family when checking wildcards · 52cae8f0

Committed by Vlad Yasevich on Aug 18, 2008

The sctp_is_any() function that is used to check for wildcard addresses
only looks at the address itself to determine the address family.
This function is used in the API to check the address passed in from
the user. If the user simply zeroes out the sockaddr_storage and
passes that in, we'll end up failing. So, let's try harder to determine
the address family by also checking the socket if possible.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: reduce memory footprint of sctp_chunk structure · c226ef9b

Committed by Neil Horman on Jul 25, 2008

sctp_chunks should be put on a diet.  This is some of the low-hanging
fruit that we can strip out.  Changes all the __s8/__u8 flags to
bitfields.  Saves 12 bytes per chunk.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
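
The effect of turning byte-sized flags into bitfields can be observed from Python with `ctypes`, which lays out structures the way the platform C compiler does. The structure and field names below are hypothetical stand-ins, not the real members of sctp_chunk.

```python
import ctypes


class FlagsAsBytes(ctypes.Structure):
    # One full byte per boolean flag (illustrative field names).
    _fields_ = [
        ("flag_a", ctypes.c_uint8),
        ("flag_b", ctypes.c_uint8),
        ("flag_c", ctypes.c_uint8),
        ("flag_d", ctypes.c_uint8),
    ]


class FlagsAsBits(ctypes.Structure):
    # The same four flags, one bit each, packed into a shared byte.
    _fields_ = [
        ("flag_a", ctypes.c_uint8, 1),
        ("flag_b", ctypes.c_uint8, 1),
        ("flag_c", ctypes.c_uint8, 1),
        ("flag_d", ctypes.c_uint8, 1),
    ]
```

Comparing `ctypes.sizeof(FlagsAsBytes)` with `ctypes.sizeof(FlagsAsBits)` shows the saving; the flags are still read and written by name in both layouts.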

04 Aug, 2008 1 commit

sctp: Drop ipfargok in sctp_xmit function · f880374c

Committed by Herbert Xu on Aug 03, 2008

The ipfragok flag controls whether the packet may be fragmented
either on the local host or beyond.  The latter is only valid on
IPv4.

In fact, we never want to do the latter even on IPv4 when PMTU is
enabled.  This is because even though we can't fragment packets
within SCTP due to the protocol's inherent faults, we can still
fragment it at IP layer.  By setting the DF bit we will improve
the PMTU process.

RFC 2960 only says that we SHOULD clear the DF bit in this case,
so we're compliant even if we set the DF bit.  In fact RFC 4960
no longer has this statement.

Once we make this change, we only need to control the local
fragmentation.  There is already a bit in the skb which controls
that, local_df.  So this patch sets that instead of using the
ipfragok argument.

The only complication is that there isn't a struct sock object
per transport, so for IPv4 we have to resort to changing the
pmtudisc field for every packet.  This should be safe though
as the protocol is single-threaded.

Note that after this patch we can remove ipfragok from the rest
of the stack too.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Jul, 2008 1 commit

sctp: make sctp_outq_flush() static · abd0b198

Committed by Adrian Bunk on Jul 22, 2008

sctp_outq_flush() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Jul, 2008 1 commit

sctp: Support ipv6only AF_INET6 sockets. · 7dab83de

Committed by Vlad Yasevich on Jul 18, 2008

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Jun, 2008 1 commit

sctp: Follow security requirement of responding with 1 packet · 2e3216cd

Committed by Vlad Yasevich on Jun 19, 2008

RFC 4960, Section 11.4. Protection of Non-SCTP-Capable Hosts

When an SCTP stack receives a packet containing multiple control or
DATA chunks and the processing of the packet requires the sending of
multiple chunks in response, the sender of the response chunk(s) MUST
NOT send more than one packet.  If bundling is supported, multiple
response chunks that fit into a single packet MAY be bundled together
into one single response packet.  If bundling is not supported, then
the sender MUST NOT send more than one response chunk and MUST
discard all other responses.  Note that this rule does NOT apply to a
SACK chunk, since a SACK chunk is, in itself, a response to DATA and
a SACK does not require a response of more DATA.

We implement this by not servicing our outqueue until we reach the end
of the packet.  This enables maximum bundling.  We also identify
'response' chunks and make sure that we only send 1 packet when sending
such chunks.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Jun, 2008 4 commits

sctp: Fix ECN markings for IPv6 · b9031d9d

Committed by Vlad Yasevich on Jun 04, 2008

Commit e9df2e8f ("[IPV6]: Use
appropriate sock tclass setting for routing lookup.") also changed the
way that ECN capable transports mark this capability in IPv6.  As a
result, SCTP was not marking ECN capability because the traffic class
was never set.  This patch brings back the markings for IPv6 traffic.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: Start T3-RTX timer when fast retransmitting lowest TSN · 62aeaff5

Committed by Vlad Yasevich on Jun 04, 2008

When we are trying to fast retransmit the lowest outstanding TSN, we
need to restart the T3-RTX timer, so that subsequent timeouts will
correctly tag all the packets necessary for retransmissions.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: Correctly implement Fast Recovery cwnd manipulations. · a6465234

Committed by Vlad Yasevich on Jun 04, 2008

Correctly keep track of Fast Recovery state and do not reduce
the congestion window multiple times during such a state.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
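
A compact way to see the "reduce only once" rule is to track a Fast Recovery exit point, as RFC 4960 Section 7.2.4 describes. The Python sketch below is a model under assumptions: the class and method names are hypothetical, state is kept in one object for simplicity, and TSN wraparound is ignored.

```python
class CongestionState:
    """Toy cwnd tracking with a Fast Recovery exit point (sketch only)."""

    def __init__(self, cwnd: int, mtu: int):
        self.cwnd = cwnd
        self.ssthresh = cwnd
        self.mtu = mtu
        self.fast_recovery_exit = None  # TSN that ends Fast Recovery

    def on_fast_retransmit(self, highest_outstanding_tsn: int) -> None:
        if self.fast_recovery_exit is not None:
            return  # already in Fast Recovery: do not cut cwnd again
        self.ssthresh = max(self.cwnd // 2, 4 * self.mtu)
        self.cwnd = self.ssthresh
        self.fast_recovery_exit = highest_outstanding_tsn

    def on_cumulative_ack(self, cum_tsn: int) -> None:
        if (self.fast_recovery_exit is not None
                and cum_tsn >= self.fast_recovery_exit):
            self.fast_recovery_exit = None  # exit point acknowledged
```

A second loss indication that arrives before the exit point is acknowledged leaves cwnd untouched; only after Fast Recovery ends can a new event halve it again.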

[SCTP]: Fix NULL dereference of asoc. · e5117101

Committed by YOSHIFUJI Hideaki on May 29, 2008

Commit 7cbca67c ("[IPV6]: Support
Source Address Selection API (RFC5014)") introduced a NULL dereference
of asoc in sctp_v6_get_saddr in net/sctp/ipv6.c.
Pointed out by Johann Felix Soden <johfel@users.sourceforge.net>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

10 May, 2008 1 commit

sctp: Bring SCTP_DELAYED_ACK socket option into API compliance · d364d927

Committed by Wei Yongjun on May 09, 2008

Brings the delayed_ack socket option set/get into line with the latest IETF
socket extensions API draft, while maintaining backwards compatibility.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Mar, 2008 1 commit

[SCTP]: Remove redundant wrapper functions. · 80445cfb

Committed by Florian Westphal on Mar 23, 2008

sctp_datamsg_free and sctp_datamsg_track are just aliases for
sctp_datamsg_put and sctp_chunk_hold, respectively.

Saves 32 bytes on x86.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Mar, 2008 1 commit

[SCTP]: extend exported data in /proc/net/sctp/assoc · 58fbbed4

Committed by Neil Horman on Feb 29, 2008

RFC 3873 specifies several MIB objects that can't be obtained from the
current data set exported by /proc/net/sctp/assoc.  This patch
adds the missing pieces of data that allow us to compute all the
objects in the sctpAssocTable object.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Feb, 2008 1 commit

[SCTP]: Stop claiming that this is a "reference implementation" · 60c778b2

Committed by Vlad Yasevich on Jan 11, 2008

I was notified by Randy Stewart that lksctp claims to be
"the reference implementation".  First of all, "the
reference implementation" was the original implementation
of SCTP in userspace written by Randy and a few others.
Second, after looking at the definition of 'reference implementation',
we don't really meet the requirements.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

29 Jan, 2008 5 commits

[SCTP]: Implement ADD-IP special case processing for ABORT chunk · 75205f47