Commits · 985990137e81ca9fd6561cd0f7d1a9695ec57d5a · openeuler / raspberrypi-kernel

21 Oct, 2005 1 commit

[TCP] Allow len == skb->len in tcp_fragment · b2cc99f0

Committed by Herbert Xu on Oct 20, 2005

It is legitimate to call tcp_fragment with len == skb->len since
that is done for FIN packets and the FIN flag counts as one byte.
So we should only check for the len > skb->len case.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>


14 Oct, 2005 2 commits

[TCP]: Ratelimit debugging warning. · 046d20b7

Committed by Herbert Xu on Oct 13, 2005

Better safe than sorry.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER]: Fix OOPSes on machines with discontiguous cpu numbering. · c8923c6b

Committed by David S. Miller on Oct 13, 2005

Original patch by Harald Welte, with feedback from Herbert Xu
and testing by Sébastien Bernard.

EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus
are numbered linearly.  That is not necessarily true.

This patch fixes that up by calculating the largest possible
cpu number, and allocating enough per-cpu structure space given
that.
Signed-off-by: David S. Miller <davem@davemloft.net>
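
The sizing rule described in this message can be sketched in userspace C. This is only an illustration under stated assumptions: the bitmask representation of possible CPUs and the names `percpu_slots_needed` and `alloc_percpu_counters` are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: number of per-CPU slots needed when CPU ids may
 * have gaps. Bit i of possible_mask set means CPU id i can exist. */
static unsigned int percpu_slots_needed(uint64_t possible_mask)
{
	unsigned int highest = 0;

	assert(possible_mask != 0);
	for (unsigned int id = 0; id < 64; id++)
		if (possible_mask & (UINT64_C(1) << id))
			highest = id;
	return highest + 1;	/* ids are 0-based */
}

/* Allocate one zeroed slot per possible id so that slot[cpu_id] is always
 * in bounds, even when some ids in between are unused. */
static unsigned long *alloc_percpu_counters(uint64_t possible_mask)
{
	return calloc(percpu_slots_needed(possible_mask), sizeof(unsigned long));
}
```

With CPUs 0, 2 and 5 present, counting CPUs would give 3 slots and make index 5 out of bounds, whereas sizing by the largest id gives 6.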


13 Oct, 2005 1 commit

[TCP]: Add code to help track down "BUG at net/ipv4/tcp_output.c:438!" · 9ff5c59c

Committed by Herbert Xu on Oct 12, 2005

This is the second report of this bug.  Unfortunately the first
reporter hasn't been able to reproduce it since to provide more
debugging info.

So let's apply this patch for 2.6.14 to

1) Make this non-fatal.
2) Provide the info we need to track it down.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


11 Oct, 2005 11 commits

[TWSK]: Grab the module refcount for timewait sockets · eeb2b856

Committed by Arnaldo Carvalho de Melo on Oct 10, 2005

This is required to avoid unloading a module that has active timewait
sockets, such as DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER] ctnetlink: add support to change protocol info · 061cb4a0

Committed by Pablo Neira Ayuso on Oct 10, 2005

This patch adds support to change the state of the private protocol
information via conntrack_netlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER] ctnetlink: allow userspace to change TCP state · 33923153

Committed by Pablo Neira Ayuso on Oct 10, 2005

This patch adds the ability to change the state of a TCP connection. I know
that this must be used with care but it's required to provide a complete
conntrack creation via conntrack_netlink. So I'll document this aspect in
the upcoming docs.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER]: Use only 32bit counters for CONNTRACK_ACCT · a051a8f7

Committed by Harald Welte on Oct 10, 2005

Initially we used 64bit counters for conntrack-based accounting, since we
had no event mechanism to tell userspace that our counters are about to
overflow.  With nfnetlink_conntrack, we now have such an event mechanism and
thus can save 16 bytes per connection.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
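
A rough sketch of the trade-off described here, assuming four counters per connection (packets and bytes for each of two directions), which is what makes the saving come out at 16 bytes. The struct names, the helper `acct32_add`, and the choice of "top bit set" as the warning threshold are assumptions for illustration, not a copy of the conntrack code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layouts: packets and bytes, for each of two directions. */
struct acct64 { uint64_t packets[2]; uint64_t bytes[2]; };	/* 32 bytes */
struct acct32 { uint32_t packets[2]; uint32_t bytes[2]; };	/* 16 bytes */

/* Account one packet of len bytes; return true once either counter has its
 * top bit set, i.e. it is at least half full and userspace should be told
 * to read it before it wraps. */
static bool acct32_add(struct acct32 *a, int dir, uint32_t len)
{
	a->packets[dir]++;
	a->bytes[dir] += len;
	return ((a->packets[dir] | a->bytes[dir]) & UINT32_C(0x80000000)) != 0;
}
```

The caller would turn a `true` return into an event for userspace, which is the mechanism the message says was missing when 64-bit counters were chosen.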


[IPSEC] Fix block size/MTU bugs in ESP · d4875b04

Committed by Herbert Xu on Oct 10, 2005

This patch fixes the following bugs in ESP:

* Fix transport mode MTU overestimate.  This means that the inner MTU
  is smaller than it needs to be.  Worse yet, given an input MTU which
  is a multiple of 4, it will always produce an estimate which is not
  a multiple of 4.

  For example, given a standard ESP/3DES/MD5 transform and an MTU of
  1500, the resulting MTU for transport mode is 1462 when it should
  be 1464.

  This is because IP header lengths are always a multiple
  of 4 for IPv4 and 8 for IPv6.

* Ensure that the block size is at least 4.  This is required by RFC2406
  and corresponds to what the esp_output function does.  At the moment
  this only affects crypto_null as its block size is 1.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


[IPSEC]: Use ALIGN macro in ESP · a02a6422

Committed by Herbert Xu on Oct 10, 2005

This patch uses the macro ALIGN in all the applicable spots for ESP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
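
For readers unfamiliar with the rounding involved, here is a minimal sketch of a power-of-two round-up macro and of the "block size is at least 4" rule from the ESP fix. The macro is named `ALIGN_UP` to make clear it is a stand-in written for this example, and `esp_effective_blksize` is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>

/* Round x up to the next multiple of a, where a is a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical helper mirroring the "block size is at least 4" rule:
 * a cipher block size of 1 (as with a null cipher) is rounded up to 4,
 * while sizes that are already multiples of 4 are left unchanged. */
static unsigned int esp_effective_blksize(unsigned int cipher_blksize)
{
	assert(cipher_blksize != 0);
	return ALIGN_UP(cipher_blksize, 4u);
}
```

Using one macro for every such round-up avoids hand-written variants of the same bit trick that differ subtly from one call site to the next.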


[NETFILTER] ctnetlink: add one nesting level for TCP state · e1c73b78

Committed by Pablo Neira Ayuso on Oct 10, 2005

To keep consistency, the TCP private protocol information is nested
as attributes under CTA_PROTOINFO_TCP. This way the sequence of attributes
to access the TCP state information looks like this:

CTA_PROTOINFO
	CTA_PROTOINFO_TCP
		CTA_PROTOINFO_TCP_STATE

instead of:

CTA_PROTOINFO
	CTA_PROTOINFO_TCP_STATE
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER] ctnetlink: ICMP ID is not mandatory · a1bcc3f2

Committed by Pablo Neira Ayuso on Oct 10, 2005

The ID is only required by ICMP type 8 (echo), so it's not
mandatory for all sorts of ICMP connections. This patch makes
only the type and the code mandatory for ICMP netlink messages.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER] conntrack_netlink: Fix endian issue with status from userspace · d000eaf7

Committed by Harald Welte on Oct 10, 2005

When we send "status" from userspace, we forget to convert the endianness.
This patch adds the required conversion.  Thanks to Pablo Neira for
discovering this.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER] ipt_ULOG: Mark ipt_ULOG as OBSOLETE · f40863ce

Committed by Harald Welte on Oct 10, 2005

Similar to nfnetlink_queue and ip_queue, we mark ipt_ULOG as obsolete.
This should have been part of the original nfnetlink_log merge, but
I somehow missed it.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER] PPTP helper: Add missing Kconfig dependency · 85d9b05d

Committed by Harald Welte on Oct 10, 2005

PPTP should not be selectable without conntrack enabled.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


09 Oct, 2005 1 commit

[PATCH] gfp flags annotations - part 1 · dd0fc66f

Committed by Al Viro on Oct 07, 2005

 - added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
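
A small sketch of why the typedef changes nothing for the compiler while still documenting intent. It assumes the annotation is only given meaning when the sparse checker defines `__CHECKER__`; the flag values `MY_GFP_WAIT` and `MY_GFP_IO` and the function `may_sleep` are made up for this example.

```c
/* Outside of sparse (__CHECKER__ undefined) the annotation expands to
 * nothing, so gfp_t is an ordinary unsigned int to the compiler. */
#ifdef __CHECKER__
# define __nocast __attribute__((nocast))
#else
# define __nocast
#endif

typedef unsigned int __nocast gfp_t;

/* Hypothetical flag values, for illustration only. */
#define MY_GFP_WAIT	((gfp_t)0x10u)
#define MY_GFP_IO	((gfp_t)0x40u)

/* The parameter type now says "allocation flags" rather than a bare
 * unsigned int, which is the documentation benefit the message mentions. */
static int may_sleep(gfp_t flags)
{
	return (flags & MY_GFP_WAIT) != 0;
}
```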


06 Oct, 2005 1 commit

[TCP]: BIC coding bug in Linux 2.6.13 · 42a39450

Committed by Stephen Hemminger on Oct 05, 2005

A missing parenthesis causes BIC to be slow in increasing the congestion
window.

Spotted by Injong Rhee.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


05 Oct, 2005 3 commits

[IPVS]: fix sparse gfp nocast warnings · 8eea00a4

Committed by Randy Dunlap on Oct 04, 2005

From: Randy Dunlap <rdunlap@xenotime.net>

Fix implicit nocast warnings in ip_vs code:
net/ipv4/ipvs/ip_vs_app.c:631:54: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER]: Fix Kconfig typo · a5181ab0

Committed by Horst H. von Brand on Oct 04, 2005

Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
Signed-off-by: David S. Miller <davem@davemloft.net>


[IPV4]: fib_trie root-node expansion · e6308be8

Committed by Robert Olsson on Oct 04, 2005

The patch below introduces special thresholds to keep the root node in the trie
large. This gives a flatter tree at the cost of a modest memory increase.
Overall it seems to be a gain, and this was also proposed by one of the authors
of the paper in a recent seminar.

Main table after loading 123 k routes:

	Aver depth:     3.30
	Max depth:      9
	Root-node size: 12 bits
	Total size:     4044 kB

With the patch:

	Aver depth:     2.78
	Max depth:      8
	Root-node size: 15 bits
	Total size:     4150 kB

An increase of 8-10% was seen in forwarding performance for an rDoS attack.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
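
To make the "root-node size" figures concrete: a root node of N bits resolves the top N bits of the destination address in a single step. The sketch below shows that indexing for IPv4. The function names are hypothetical and this is not the fib_trie code itself, only the arithmetic behind a 12-bit versus a 15-bit root.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the root-node child slot for an IPv4
 * address, when the root node consumes the top root_bits bits. */
static uint32_t root_child_index(uint32_t addr, unsigned int root_bits)
{
	assert(root_bits >= 1 && root_bits <= 32);
	if (root_bits == 32)
		return addr;
	return addr >> (32 - root_bits);
}

/* Number of child slots in a root node of the given width. */
static uint64_t root_child_count(unsigned int root_bits)
{
	return UINT64_C(1) << root_bits;
}
```

Going from 12 to 15 bits means 32768 root slots instead of 4096, so three more address bits are resolved by the first lookup, which is consistent with the shallower average depth reported in the message.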


04 Oct, 2005 5 commits

[IPV4]: Update icmp sysctl docs and disable broadcast ECHO/TIMESTAMP by default · 7ce31246

Committed by David S. Miller on Oct 03, 2005

It's not a good idea to be smurf'able by default.
The few people who need this can turn it on.
Signed-off-by: David S. Miller <davem@davemloft.net>


[IPV4]: Replace __in_dev_get with __in_dev_get_rcu/rtnl · e5ed6399

Committed by Herbert Xu on Oct 03, 2005

The following patch renames __in_dev_get() to __in_dev_get_rtnl() and
introduces __in_dev_get_rcu() to cover the second case.

1) RCU with refcnt should use in_dev_get().
2) RCU without refcnt should use __in_dev_get_rcu().
3) All others must hold RTNL and use __in_dev_get_rtnl().

There is one exception in net/ipv4/route.c which is in fact a pre-existing
race condition.  I've marked it as such so that we remember to fix it.

This patch is based on suggestions and prior work by Suzanne Wood and
Paul McKenney.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


[IPV4]: Fix "Proxy ARP seems broken" · 444fc8fc

Committed by Herbert Xu on Oct 03, 2005

Meelis Roos <mroos@linux.ee> wrote:
> RK> My firewall setup relies on proxyarp working.  However, with 2.6.14-rc3,
> RK> it appears to be completely broken.  The firewall is 212.18.232.186,
> 
> Same here with some kernel between 14-rc2 and 14-rc3 - no reposnse to
> ARP on a proxyarp gateway. Sorry, no exact revison and no more debugging
> yet since it'a a production gateway.

The breakage is caused by the change to use the CB area for flagging
whether a packet has been queued due to proxy_delay.  This area gets
cleared every time arp_rcv gets called.  Unfortunately packets delayed
due to proxy_delay also go through arp_rcv when they are reprocessed.

In fact, I can't think of a reason why delayed proxy packets should go
through netfilter again at all.  So the easiest solution is to bypass
that and go straight to arp_process.

This is essentially what would've happened before netfilter support
was added to ARP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> 
Signed-off-by: David S. Miller <davem@davemloft.net>


[INET]: speedup inet (tcp/dccp) lookups · 81c3d547

Committed by Eric Dumazet on Oct 03, 2005

Arnaldo and I agreed it could be applied now, because I have other
pending patches depending on this one (Thank you Arnaldo)

(The other important patch moves skc_refcnt in a separate cache line,
so that the SMP/NUMA performance doesn't suffer from cache line ping pongs)

1) First some performance data :
--------------------------------

tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()

The most time critical code is :

sk_for_each(sk, node, &head->chain) {
     if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
         goto hit; /* You sunk my battleship! */
}

The sk_for_each() does use prefetch() hints but only the beginning of
"struct sock" is prefetched.

As the first INET_MATCH comparison uses inet_sk(__sk)->daddr, which is far
away from the beginning of "struct sock", it has to bring a cold cache
line into the CPU cache. Each iteration has to use at least 2 cache
lines.

This can be problematic if some chains are very long.

2) The goal
-----------

The idea I had is to change things so that INET_MATCH() may return
FALSE in 99% of cases only using the data already in the CPU cache,
using one cache line per iteration.

3) Description of the patch
---------------------------

Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
filling a 32 bits hole on 64 bits platform.

struct sock_common {
	unsigned short		skc_family;
	volatile unsigned char	skc_state;
	unsigned char		skc_reuse;
	int			skc_bound_dev_if;
	struct hlist_node	skc_node;
	struct hlist_node	skc_bind_node;
	atomic_t		skc_refcnt;
+	unsigned int		skc_hash;
	struct proto		*skc_prot;
};

Store in this 32 bits field the full hash, not masked by (ehash_size -
1). Using this full hash as the first comparison done in INET_MATCH
permits us to immediately skip the element without touching a second cache
line in case of a miss.

Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
sk_hash and tw_hash) already contains the slot number if we mask with
(ehash_size - 1)

File include/net/inet_hashtables.h

64 bits platforms :
#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)->sk_hash == (__hash))                         &&  \
     ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie))   &&  \
     ((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports))   &&  \
     (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))

32bits platforms:
#define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)->sk_hash == (__hash))                 &&  \
     (inet_sk(__sk)->daddr          == (__saddr))   &&  \
     (inet_sk(__sk)->rcv_saddr      == (__daddr))   &&  \
     (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))


- Adds a prefetch(head->chain.first) in 
__inet_lookup_established()/__tcp_v4_check_established() and 
__inet6_lookup_established()/__tcp_v6_check_established() and 
__dccp_v4_check_established() to bring into cache the first element of the 
list, before the {read|write}_lock(&head->lock);
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
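
The idea of rejecting most chain entries on the stored full hash can be sketched with a simplified list walk. Everything here is a stand-in: `struct sock_entry`, `lookup` and `bucket_of` are hypothetical names, and the `pad` array only imitates the distance between the start of the socket and its address fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified socket entry. The full hash sits next to the
 * list pointer, so a mismatch is detected without touching the address
 * fields that live further into the structure. */
struct sock_entry {
	struct sock_entry *next;
	uint32_t hash;		/* full, unmasked hash */
	char pad[128];		/* stands in for the distance to the fields below */
	uint32_t daddr, saddr;
	uint32_t ports;
};

static struct sock_entry *lookup(struct sock_entry *chain, uint32_t hash,
				 uint32_t daddr, uint32_t saddr, uint32_t ports)
{
	for (struct sock_entry *e = chain; e != NULL; e = e->next) {
		if (e->hash != hash)
			continue;	/* cheap reject: same cache line as next */
		if (e->daddr == daddr && e->saddr == saddr && e->ports == ports)
			return e;
	}
	return NULL;
}

/* The bucket index is recovered from the stored hash by masking, so no
 * separate slot-number field is needed. table_size must be a power of two. */
static uint32_t bucket_of(uint32_t hash, uint32_t table_size)
{
	return hash & (table_size - 1);
}
```

Two sockets in the same bucket share only the low bits of the hash, so comparing the full 32-bit value first filters out nearly all non-matching entries before the second cache line is read.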


[NET]: Fix packet timestamping. · 325ed823

Committed by Herbert Xu on Oct 03, 2005

I've found the problem in general.  It affects any 64-bit
architecture.  The problem occurs when you change the system time.

Suppose that when you boot, your system clock is a day ahead.
This gets recorded in skb_tv_base.  You then wind the clock back
by a day.  From that point onwards the offset will be negative, which
essentially overflows the 32-bit variables it is stored in.

In fact, why don't we just store the real time stamp in those 32-bit
variables? After all, we're not going to overflow for quite a while
yet.

When we do overflow, we'll need a better solution of course.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
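
The failure mode can be reproduced with plain integer arithmetic. This sketch assumes a 64-bit base time and an unsigned 32-bit field holding seconds since that base; the helper names are hypothetical and the real fields in the kernel may differ in detail.

```c
#include <stdint.h>

/* Hypothetical reconstruction of the failure mode: a 64-bit base time and
 * a 32-bit unsigned field holding "seconds since base". */
static uint32_t store_offset(int64_t now_sec, int64_t base_sec)
{
	return (uint32_t)(now_sec - base_sec);	/* wraps if now < base */
}

static int64_t load_offset(uint32_t off, int64_t base_sec)
{
	return base_sec + (int64_t)off;
}

/* The alternative the message proposes: store the real seconds value
 * directly in the 32-bit field. */
static uint32_t store_absolute(int64_t now_sec)
{
	return (uint32_t)now_sec;
}
```

If the base was recorded one day ahead of the current time, the stored offset is -86400 reduced modulo 2^32, and the reconstructed timestamp lands 2^32 seconds in the future. Storing the absolute value avoids the subtraction entirely and stays correct until the 32-bit seconds value itself overflows.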


30 Sep, 2005 2 commits

[TCP]: Don't over-clamp window in tcp_clamp_window() · 09e9ec87

Committed by Alexey Kuznetsov on Sep 29, 2005

From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>

Better handle the case where the sender sends full sized
frames initially, then moves to a mode where it trickles
out small amounts of data at a time.

This known problem is even mentioned in the comments
above tcp_grow_window() in tcp_input.c, specifically:

...
 * The scheme does not work when sender sends good segments opening
 * window and then starts to feed us spagetti. But it should work
 * in common situations. Otherwise, we have to rely on queue collapsing.
...

When the sender gives full sized frames, the "struct sk_buff" overhead
from each packet is small.  So we'll advertize a larger window.
If the sender moves to a mode where small segments are sent, this
ratio becomes tilted to the other extreme and we start overrunning
the socket buffer space.

tcp_clamp_window() tries to address this, but its clamping of
tp->window_clamp is a wee bit too aggressive for this particular case.

Fix confirmed by Ion Badulescu.
Signed-off-by: David S. Miller <davem@davemloft.net>


[TCP]: Revert · 01ff367e

Committed by David S. Miller on Sep 29, 2005

But retain the comment fix.

Alexey Kuznetsov has explained the situation as follows:

--------------------

I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is
not continuous: f.e. for mss=1095 it needs initial window 1095*4, but
for mss=1096 it is 1096*3. We do not know exactly what mss sender used
for calculations. If we advertised 1096 (and calculate initial window
3*1096), the sender could limit it to some value < 1096 and then it
will need window his_mss*4 > 3*1096 to send initial burst.

See?

So, the honest function for initial rcv_wnd derived from
tcp_init_cwnd() is:

	init_rcv_wnd(mss)=
	  min { init_cwnd(mss1)*mss1 for mss1 <= mss }

It is something like:

	if (mss < 1096)
		return mss*4;
	if (mss < 1096*2)
		return 1096*4;
	return mss*2;

(I just scribbled a graph on a piece of paper, it is difficult to see or
to explain without this)

I selected it differently giving more window than it is strictly
required.  Initial receive window must be large enough to allow sender
following to the rfc (or just setting initial cwnd to 2) to send
initial burst.  But besides that it is arbitrary, so I decided to give
slack space of one segment.

Actually, the logic was:

If mss is low/normal (<=ethernet), set window to receive more than
initial burst allowed by rfc under the worst conditions
i.e. mss*4. This gives slack space of 1 segment for ethernet frames.

For msses slightly more than ethernet frame, take 3. Try to give slack
space of 1 frame again.

If mss is huge, force 2*mss. No slack space.

Value 1460*3 is really confusing. Minimal one is 1096*2, but besides
that it is an arbitrary value. It was meant to be ~4096. 1460*3 is
just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so
that I guess hands typed this themselves.

--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
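
The piecewise function quoted in the explanation can be written out as compilable C. This is a direct transcription of the three branches given in the message, with the thresholds unchanged; the function name `init_rcv_wnd` follows the message, while the `unsigned int` types are an assumption made for this sketch.

```c
/* Direct transcription of the piecewise function quoted in the message;
 * the unsigned int types are assumptions made for this sketch. */
static unsigned int init_rcv_wnd(unsigned int mss)
{
	if (mss < 1096)
		return mss * 4;		/* small mss: four segments */
	if (mss < 1096 * 2)
		return 1096 * 4;	/* plateau covering the mss*4 peak */
	return mss * 2;			/* large mss: two segments */
}
```

For an Ethernet-sized mss of 1460 this gives 4384 bytes, which is enough for a sender that clamped its own mss to 1095 and therefore sends an initial burst of 4 * 1095 = 4380 bytes.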


29 Sep, 2005 1 commit

[TCP]: Fix init_cwnd calculations in tcp_select_initial_window() · 6b251858

Committed by David S. Miller on Sep 28, 2005

Match it up to what RFC2414 really specifies.
Noticed by Rick Jones.
Signed-off-by: David S. Miller <davem@davemloft.net>


27 Sep, 2005 1 commit

[NETFILTER]: Fix invalid module autoloading by splitting iptable_nat · 188bab3a

Committed by Harald Welte on Sep 26, 2005

When you've enabled conntrack and NAT as a module (standard case in all
distributions), and you've also enabled the new conntrack netlink
interface, loading ip_conntrack_netlink.ko will auto-load iptable_nat.ko.
This causes a huge performance penalty, since for every packet you iterate
the nat code, even if you don't want it.

This patch splits iptable_nat.ko into the NAT core (ip_nat.ko) and the
iptables frontend (iptable_nat.ko).  Therefore, ip_conntrack_netlink.ko will
only pull ip_nat.ko, but not the frontend.  ip_nat.ko will "only" allocate
some resources, but not affect runtime performance.

This separation is also a nice step in anticipation of new packet filters
(nf-hipac, ipset, pkttables) being able to use the NAT core.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


25 Sep, 2005 2 commits

[NETFILTER] ip_conntrack: Update event cache when status changes · 8ddec746

Committed by Harald Welte on Sep 24, 2005

The GRE, SCTP and TCP protocol helpers did not call
ip_conntrack_event_cache() when updating ct->status.  This patch adds
the respective calls.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER]: Fix ip[6]t_NFQUEUE Kconfig dependency · d67b24c4

Committed by Harald Welte on Sep 24, 2005

We have to introduce a separate Kconfig menu entry for the NFQUEUE targets.
They cannot "just" depend on nfnetlink_queue, since nfnetlink_queue could
be linked into the kernel, whereas iptables can be a module.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


23 Sep, 2005 4 commits

[NETFILTER] Fix conntrack event cache deadlock/oops · 1dfbab59

Committed by Harald Welte on Sep 22, 2005

This patch fixes a number of bugs.  It cannot be reasonably split up in
multiple fixes, since all bugs interact with each other and affect the same
function:

Bug #1:
The event cache code cannot be called while a lock is held.  Therefore, the
call to ip_conntrack_event_cache() within ip_ct_refresh_acct() needs to be
moved outside of the locked section.  This fixes a number of 2.6.14-rcX
oops and deadlock reports.

Bug #2:
We used to call ct_add_counters() for unconfirmed connections without
holding a lock.  Since the add operations are not atomic, we could race
with another CPU.

Bug #3:
ip_ct_refresh_acct() lost REFRESH events in some cases where refresh
(and the corresponding event) are desired, but no accounting shall be
performed.  Both events and accounting implicitly depended on the skb
parameter being non-null.  We now re-introduce a non-accounting
"ip_ct_refresh()" variant to explicitly state the desired behaviour.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
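
The pattern behind the fix for bug #1, deciding under the lock but delivering the event only after the lock is dropped, can be sketched as follows. This uses a toy lock that merely records whether it is held, so the rule can be checked with an assertion; `struct toy_lock`, `struct conn`, `notify_refresh` and `refresh` are hypothetical names and not the conntrack API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only records whether it is held; a real implementation
 * would use a spinlock or mutex. All names here are hypothetical. */
struct toy_lock { bool held; };

struct conn {
	struct toy_lock lock;
	unsigned long timeout;
	unsigned int events_sent;
};

/* Must not be called with c->lock held. */
static void notify_refresh(struct conn *c)
{
	assert(!c->lock.held);
	c->events_sent++;
}

static void refresh(struct conn *c, unsigned long new_timeout)
{
	bool changed = false;

	c->lock.held = true;		/* lock */
	if (c->timeout != new_timeout) {
		c->timeout = new_timeout;
		changed = true;		/* remember, but do not notify yet */
	}
	c->lock.held = false;		/* unlock */

	if (changed)
		notify_refresh(c);	/* event delivered after unlock */
}
```

Recording the decision in a local variable keeps the critical section short and guarantees that the notification path never runs with the lock held.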


[NETFILTER] Fix sparse endian warnings in pptp helper · 67497205

Committed by Alexey Dobriyan on Sep 22, 2005

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER] fix DEBUG statement in PPTP helper · 0ae5d253

Committed by Harald Welte on Sep 22, 2005

As noted by Alexey Dobriyan, the DEBUGP statement prints the wrong
callID.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


[TCP]: Adjust Reno SACK estimate in tcp_fragment · 83ca28be

Committed by Herbert Xu on Sep 22, 2005

Since the introduction of TSO pcount a year ago, it has been possible
for tcp_fragment() to cause packets_out to decrease.  Prior to that,
tcp_retrans_try_collapse() was the only way for that to happen on the
retransmission path.

When this happens with Reno, it is possible for sacked_out to become
invalid because it is only an estimate and not tied to any particular
packet on the retransmission queue.

Therefore we need to adjust sacked_out as well as left_out in the Reno
case.  The following patch does exactly that.

This bug is pretty difficult to trigger in practice though since you
need a SACKless peer with a retransmission that occurs just as the
cached MTU value expires.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


21 Sep, 2005 2 commits

[TCP]: Set default congestion control correctly for incoming connections. · 7957aed7

Committed by Stephen Hemminger on Sep 21, 2005

Patch from Joel Sing to fix the default congestion control algorithm
for incoming connections. If a new congestion control handler is added
(via module), it should become the default for new
connections. Instead, the incoming connections use reno. The cause is
an incorrect initialisation that makes the tcp_init_congestion_control()
function return after the initial if test fails.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


[FIB_TRIE]: message cleanup · 78c6671a

Committed by Stephen Hemminger on Sep 21, 2005

Clean up the printk's in fib_trie:
	* Convert a couple of places in the dump code to BUG_ON
	* Put log levels on each message
The version message really needed the message since it leaks out
on the pretty Fedora bootup.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>


20 Sep, 2005 3 commits

[PATCH] raw_sendmsg DoS on 2.6 · 6d1cfe3f

Committed by Mark J Cox on Sep 19, 2005

Fix unchecked __get_user that could be tricked into generating a
memory read on an arbitrary address.  The result of the read is not
returned directly but you may be able to divine some information about
it, or use the read to cause a crash on some architectures by reading
hardware state.  CAN-2004-2492.

Fix from Al Viro, ack from Dave Miller.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


[TCP]: Handle SACK'd packets properly in tcp_fragment(). · e14c3caf

Committed by Herbert Xu on Sep 19, 2005

The problem is that we're now calling tcp_fragment() in a context
where the packets might be marked as SACKED_ACKED or SACKED_RETRANS.
This was not possible before as you never retransmitted packets that
are so marked.

Because of this, we need to adjust sacked_out and retrans_out in
tcp_fragment().  This is exactly what the following patch does.

We also need to preserve the SACKED_ACKED/SACKED_RETRANS marking
if they exist.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


[NETFILTER]: Export ip_nat_port_{nfattr_to_range,range_to_nfattr} · 8922bc93

Committed by Harald Welte on Sep 19, 2005

Those exports are needed by the PPTP helper following in the next
couple of changes.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
