提交 · 00fa02334540ec795934737cd6e6ef8db2560731 · openeuler / Kernel

05 10月, 2005 8 次提交

[AF_KEY]: fix sparse gfp nocast warnings · 00fa0233

由 Randy Dunlap 提交于 10月 04, 2005

Fix implicit nocast warnings in net/key code:
net/key/af_key.c:195:27: warning: implicit cast to nocast type
net/key/af_key.c:1439:28: warning: implicit cast to nocast type
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

00fa0233

[NETFILTER]: fix sparse gfp nocast warnings · c6f4fafc

由 Randy Dunlap 提交于 10月 04, 2005

Fix implicit nocast warnings in nfnetlink code:
net/netfilter/nfnetlink.c:204:43: warning: implicit cast to nocast type
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c6f4fafc

[IPVS]: fix sparse gfp nocast warnings · 8eea00a4

由 Randy Dunlap 提交于 10月 04, 2005

From: Randy Dunlap <rdunlap@xenotime.net>

Fix implicit nocast warnings in ip_vs code:
net/ipv4/ipvs/ip_vs_app.c:631:54: warning: implicit cast to nocast type
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8eea00a4

[DECNET]: fix sparse gfp nocast warnings · f4a19a56

由 Randy Dunlap 提交于 10月 04, 2005

Fix implicit nocast warnings in decnet code:
net/decnet/af_decnet.c:458:40: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:125:35: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:219:29: warning: implicit cast to nocast type
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f4a19a56

[ATM]: fix sparse gfp nocast warnings · 7b5b3f3d

由 Randy Dunlap 提交于 10月 04, 2005

Fix implicit nocast warnings in atm code:
net/atm/atm_misc.c:35:44: warning: implicit cast to nocast type
drivers/atm/fore200e.c:183:33: warning: implicit cast to nocast type

Also use kzalloc() instead of kmalloc().
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7b5b3f3d

[NETFILTER]: Fix Kconfig typo · a5181ab0

由 Horst H. von Brand 提交于 10月 04, 2005

Signed-off-by: NHorst H. von Brand <vonbrand@inf.utfsm.cl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a5181ab0

[IPV4]: fib_trie root-node expansion · e6308be8

由 Robert Olsson 提交于 10月 04, 2005

The patch below introduces special thresholds to keep root node in the trie 
large. This gives a flatter tree at the cost of a modest memory increase.
Overall it seems to be gain and this was also proposed by one the authors 
of the paper in recent a seminar.

Main table after loading 123 k routes.

	Aver depth:     3.30
	Max depth:      9
        Root-node size  12 bits
        Total size: 4044  kB

With the patch:
	Aver depth:     2.78
	Max depth:      8
        Root-node size  15 bits
        Total size: 4150  kB

An increase of 8-10% was seen in forwading performance for an rDoS attack. 
Signed-off-by: NRobert Olsson <robert.olsson@its.uu.se>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e6308be8

[IPV6]: Fix infinite loop in udp_v6_get_port(). · 87bf9c97

由 YOSHIFUJI Hideaki 提交于 10月 04, 2005

Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

87bf9c97

04 10月, 2005 11 次提交

[PATCH] ieee80211: fix gfp flags type · f36a29d5

由 Randy Dunlap 提交于 10月 03, 2005

Fix implicit nocast warnings in ieee80211 code, including __nocast:
net/ieee80211/ieee80211_tx.c:215:9: warning: implicit cast to nocast type
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NJeff Garzik <jgarzik@pobox.com>

f36a29d5

[PATCH] ieee80211: fix gfp flags type · 8cb6108b

由 Randy Dunlap 提交于 10月 02, 2005

Fix implicit nocast warnings in ieee80211 code:
net/ieee80211/ieee80211_tx.c:215:9: warning: implicit cast to nocast type
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NJeff Garzik <jgarzik@pobox.com>

8cb6108b

[IPV4]: Update icmp sysctl docs and disable broadcast ECHO/TIMESTAMP by default · 7ce31246

由 David S. Miller 提交于 10月 03, 2005

It's not a good idea to be smurf'able by default.
The few people who need this can turn it on.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7ce31246

[IPV4]: Get rid of bogus __in_put_dev in pktgen · 3e56a40b

由 Herbert Xu 提交于 10月 03, 2005

This patch gets rid of a bogus __in_dev_put() in pktgen.c.  This was
spotted by Suzanne Wood.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3e56a40b

[IPV4]: Replace __in_dev_get with __in_dev_get_rcu/rtnl · e5ed6399

由 Herbert Xu 提交于 10月 03, 2005

The following patch renames __in_dev_get() to __in_dev_get_rtnl() and
introduces __in_dev_get_rcu() to cover the second case.

1) RCU with refcnt should use in_dev_get().
2) RCU without refcnt should use __in_dev_get_rcu().
3) All others must hold RTNL and use __in_dev_get_rtnl().

There is one exception in net/ipv4/route.c which is in fact a pre-existing
race condition.  I've marked it as such so that we remember to fix it.

This patch is based on suggestions and prior work by Suzanne Wood and
Paul McKenney.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5ed6399

D
[IPV6]: Fix leak added by udp connect dst caching fix. · a5e7c210
由 David S. Miller 提交于 10月 03, 2005
```
Based upon a patch from Mitsuru KANDA <mk@linux-ipv6.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
a5e7c210
Y
[IPV6]: Fix ipv6 fragment ID selection at slow path · f36d6ab1
由 Yan Zheng 提交于 10月 03, 2005
```
Signed-Off-By: NYan Zheng <yanzheng@21cn.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
f36d6ab1

[IPV4]: Fix "Proxy ARP seems broken" · 444fc8fc

由 Herbert Xu 提交于 10月 03, 2005

Meelis Roos <mroos@linux.ee> wrote:
> RK> My firewall setup relies on proxyarp working.  However, with 2.6.14-rc3,
> RK> it appears to be completely broken.  The firewall is 212.18.232.186,
> 
> Same here with some kernel between 14-rc2 and 14-rc3 - no reposnse to
> ARP on a proxyarp gateway. Sorry, no exact revison and no more debugging
> yet since it'a a production gateway.

The breakage is caused by the change to use the CB area for flagging
whether a packet has been queued due to proxy_delay.  This area gets
cleared every time arp_rcv gets called.  Unfortunately packets delayed
due to proxy_delay also go through arp_rcv when they are reprocessed.

In fact, I can't think of a reason why delayed proxy packets should go
through netfilter again at all.  So the easiest solution is to bypass
that and go straight to arp_process.

This is essentially what would've happened before netfilter support
was added to ARP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> 
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

444fc8fc

[NET]: Fix "sysctl_net.c:36: error: 'core_table' undeclared here" · 496a22b0

由 Russell King 提交于 10月 03, 2005

During the build for ARM machine type "fortunet", this error occurred:

  CC      net/sysctl_net.o
net/sysctl_net.c:36: error: 'core_table' undeclared here (not in a function)

It appears that the following configuration settings cause this error
due to a missing include:
CONFIG_SYSCTL=y
CONFIG_NET=y
# CONFIG_INET is not set

core_table appears to be declared in net/sock.h.  if CONFIG_INET were
defined, net/sock.h would have been included via:
  sysctl_net.c -> net/ip.h -> linux/ip.h -> net/sock.h

so include it directly.
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

496a22b0

[INET]: speedup inet (tcp/dccp) lookups · 81c3d547

由 Eric Dumazet 提交于 10月 03, 2005

Arnaldo and I agreed it could be applied now, because I have other
pending patches depending on this one (Thank you Arnaldo)

(The other important patch moves skc_refcnt in a separate cache line,
so that the SMP/NUMA performance doesnt suffer from cache line ping pongs)

1) First some performance data :
--------------------------------

tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()

The most time critical code is :

sk_for_each(sk, node, &head->chain) {
     if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
         goto hit; /* You sunk my battleship! */
}

The sk_for_each() does use prefetch() hints but only the begining of
"struct sock" is prefetched.

As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far
away from the begining of "struct sock", it has to bring into CPU
cache cold cache line. Each iteration has to use at least 2 cache
lines.

This can be problematic if some chains are very long.

2) The goal
-----------

The idea I had is to change things so that INET_MATCH() may return
FALSE in 99% of cases only using the data already in the CPU cache,
using one cache line per iteration.

3) Description of the patch
---------------------------

Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
filling a 32 bits hole on 64 bits platform.

struct sock_common {
	unsigned short		skc_family;
	volatile unsigned char	skc_state;
	unsigned char		skc_reuse;
	int			skc_bound_dev_if;
	struct hlist_node	skc_node;
	struct hlist_node	skc_bind_node;
	atomic_t		skc_refcnt;
+	unsigned int		skc_hash;
	struct proto		*skc_prot;
};

Store in this 32 bits field the full hash, not masked by (ehash_size -
1) Using this full hash as the first comparison done in INET_MATCH
permits us immediatly skip the element without touching a second cache
line in case of a miss.

Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
sk_hash and tw_hash) already contains the slot number if we mask with
(ehash_size - 1)

File include/net/inet_hashtables.h

64 bits platforms :
#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)->sk_hash == (__hash))
     ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie))   &&  \
     ((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports))   &&  \
     (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))

32bits platforms:
#define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)->sk_hash == (__hash))                 &&  \
     (inet_sk(__sk)->daddr          == (__saddr))   &&  \
     (inet_sk(__sk)->rcv_saddr      == (__daddr))   &&  \
     (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))


- Adds a prefetch(head->chain.first) in 
__inet_lookup_established()/__tcp_v4_check_established() and 
__inet6_lookup_established()/__tcp_v6_check_established() and 
__dccp_v4_check_established() to bring into cache the first element of the 
list, before the {read|write}_lock(&head->lock);
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Acked-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

81c3d547

[NET]: Fix packet timestamping. · 325ed823

由 Herbert Xu 提交于 10月 03, 2005

I've found the problem in general.  It affects any 64-bit
architecture.  The problem occurs when you change the system time.

Suppose that when you boot your system clock is forward by a day.
This gets recorded down in skb_tv_base.  You then wind the clock back
by a day.  From that point onwards the offset will be negative which
essentially overflows the 32-bit variables they're stored in.

In fact, why don't we just store the real time stamp in those 32-bit
variables? After all, we're not going to overflow for quite a while
yet.

When we do overflow, we'll need a better solution of course.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

325ed823

30 9月, 2005 4 次提交

[ATM]: [lec] reset retry counter when new arp issued · 75b895c1

由 Scott Talbert 提交于 9月 29, 2005

From: Scott Talbert <scott.talbert@lmco.com>
Signed-off-by: NChas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

75b895c1

[ATM]: [lec] attempt to support cisco failover · 4a7097fc

由 Scott Talbert 提交于 9月 29, 2005

From: Scott Talbert <scott.talbert@lmco.com>
Signed-off-by: NChas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4a7097fc

[TCP]: Don't over-clamp window in tcp_clamp_window() · 09e9ec87

由 Alexey Kuznetsov 提交于 9月 29, 2005

From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>

Handle better the case where the sender sends full sized
frames initially, then moves to a mode where it trickles
out small amounts of data at a time.

This known problem is even mentioned in the comments
above tcp_grow_window() in tcp_input.c, specifically:

...
 * The scheme does not work when sender sends good segments opening
 * window and then starts to feed us spagetti. But it should work
 * in common situations. Otherwise, we have to rely on queue collapsing.
...

When the sender gives full sized frames, the "struct sk_buff" overhead
from each packet is small.  So we'll advertize a larger window.
If the sender moves to a mode where small segments are sent, this
ratio becomes tilted to the other extreme and we start overrunning
the socket buffer space.

tcp_clamp_window() tries to address this, but it's clamping of
tp->window_clamp is a wee bit too aggressive for this particular case.

Fix confirmed by Ion Badulescu.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

09e9ec87

[TCP]: Revert · 01ff367e

由 David S. Miller 提交于 9月 29, 2005

But retain the comment fix.

Alexey Kuznetsov has explained the situation as follows:

--------------------

I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is
not continuous: f.e. for mss=1095 it needs initial window 1095*4, but
for mss=1096 it is 1096*3. We do not know exactly what mss sender used
for calculations. If we advertised 1096 (and calculate initial window
3*1096), the sender could limit it to some value < 1096 and then it
will need window his_mss*4 > 3*1096 to send initial burst.

See?

So, the honest function for inital rcv_wnd derived from
tcp_init_cwnd() is:

	init_rcv_wnd(mss)=
	  min { init_cwnd(mss1)*mss1 for mss1 <= mss }

It is something sort of:

	if (mss < 1096)
		return mss*4;
	if (mss < 1096*2)
		return 1096*4;
	return mss*2;

(I just scrablled a graph of piece of paper, it is difficult to see or
to explain without this)

I selected it differently giving more window than it is strictly
required.  Initial receive window must be large enough to allow sender
following to the rfc (or just setting initial cwnd to 2) to send
initial burst.  But besides that it is arbitrary, so I decided to give
slack space of one segment.

Actually, the logic was:

If mss is low/normal (<=ethernet), set window to receive more than
initial burst allowed by rfc under the worst conditions
i.e. mss*4. This gives slack space of 1 segment for ethernet frames.

For msses slighlty more than ethernet frame, take 3. Try to give slack
space of 1 frame again.

If mss is huge, force 2*mss. No slack space.

Value 1460*3 is really confusing. Minimal one is 1096*2, but besides
that it is an arbitrary value. It was meant to be ~4096. 1460*3 is
just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so
that I guess hands typed this themselves.

--------------------
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

01ff367e

29 9月, 2005 6 次提交

[PATCH] proc_mkdir() should be used to create procfs directories · 66600221

由 Al Viro 提交于 9月 28, 2005

A bunch of create_proc_dir_entry() calls creating directories had crept
in since the last sweep; converted to proc_mkdir().
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

66600221

[NET]: Fix reversed logic in eth_type_trans(). · 01d40f28

由 David S. Miller 提交于 9月 28, 2005

I got the second compare_eth_addr() test reversed, oops.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

01d40f28

[ATM]: fix bug in atm address list handling · 735631a9

由 Martin Whitaker 提交于 9月 28, 2005

From: Martin Whitaker <atm@martin-whitaker.co.uk>
Signed-off-by: NChas Williams <chas@cmf.nrl.navy.mil>

735631a9

C
[ATM]: track and close listen sockets when sigd exits · 9301e320
由 Chas Williams 提交于 9月 28, 2005
```
Signed-off-by: NChas Williams <chas@cmf.nrl.navy.mil>
```
9301e320
R
[ATM]: net/atm/ioctl.c: autoload pppoatm and br2684 · e2c4b721
由 Roman Kagan 提交于 9月 28, 2005
```
Signed-off-by: NRoman Kagan <rkagan@mail.ru>
Signed-off-by: NChas Williams <chas@cmf.nrl.navy.mil>
```
e2c4b721

[TCP]: Fix init_cwnd calculations in tcp_select_initial_window() · 6b251858

由 David S. Miller 提交于 9月 28, 2005

Match it up to what RFC2414 really specifies.
Noticed by Rick Jones.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6b251858

28 9月, 2005 10 次提交

[APPLETALK]: Fix broadcast bug. · 64233bff

由 Oliver Dawid 提交于 9月 27, 2005

From: Oliver Dawid <oliver@helios.de>

we found a bug in net/appletalk/ddp.c concerning broadcast packets. In 
kernel 2.4 it was working fine. The bug first occured 4 years ago when 
switching to new SNAP layer handling. This bug can be splitted up into a 
sending(1) and reception(2) problem:

Sending(1)
In kernel 2.4 broadcast packets were sent to a matching ethernet device 
and atalk_rcv() was called to receive it as "loopback" (so loopback 
packets were shortcutted and handled in DDP layer).

When switching to the new SNAP structure, this shortcut was removed and 
the loopback packet was send to SNAP layer. The author forgot to replace 
the remote device pointer by the loopback device pointer before sending 
the packet to SNAP layer (by calling ddp_dl->request() ) therfor the 
packet was not sent back by underlying layers to ddp's atalk_rcv().

Reception(2)
In atalk_rcv() a packet received by this loopback mechanism contains now 
the (rigth) loopback device pointer (in Kernel 2.4 it was the (wrong) 
remote ethernet device pointer) and therefor no matching socket will be 
found to deliver this packet to. Because a broadcast packet should be 
send to the first matching socket (as it is done in many other protocols 
(?)), we removed the network comparison in broadcast case.

Below you will find a patch to correct this bug. Its diffed to kernel 
2.6.14-rc1
Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

64233bff

[NET]: Slightly optimize ethernet address comparison. · ba645c16

由 David S. Miller 提交于 9月 27, 2005

We know the thing is at least 2-byte aligned, so take
advantage of that instead of invoking memcmp() which
results in truly horrifically inefficient code because
it can't assume anything about alignment.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ba645c16

[ROSE]: fix typo (regeistration) · 520d1b83

由 Alexey Dobriyan 提交于 9月 27, 2005

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

520d1b83

[ROSE]: check rose_ndevs earlier · a83cd2cc

由 Alexey Dobriyan 提交于 9月 27, 2005

* Don't bother with proto registering if rose_ndevs is bad.
* Make escape structure more coherent.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a83cd2cc

[ROSE]: return sane -E* from rose_proto_init() · 70ff3b66

由 Alexey Dobriyan 提交于 9月 27, 2005

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

70ff3b66

[ROSE]: do proto_unregister() on exit paths · c3c4ed65

由 Alexey Dobriyan 提交于 9月 27, 2005

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c3c4ed65

[NET]: Fix module reference counts for loadable protocol modules · a79af59e

由 Frank Filz 提交于 9月 27, 2005

I have been experimenting with loadable protocol modules, and ran into
several issues with module reference counting.

The first issue was that __module_get failed at the BUG_ON check at
the top of the routine (checking that my module reference count was
not zero) when I created the first socket. When sk_alloc() is called,
my module reference count was still 0. When I looked at why sctp
didn't have this problem, I discovered that sctp creates a control
socket during module init (when the module ref count is not 0), which
keeps the reference count non-zero. This section has been updated to
address the point Stephen raised about checking the return value of
try_module_get().

The next problem arose when my socket init routine returned an error.
This resulted in my module reference count being decremented below 0.
My socket ops->release routine was also being called. The issue here
is that sock_release() calls the ops->release routine and decrements
the ref count if sock->ops is not NULL. Since the socket probably
didn't get correctly initialized, this should not be done, so we will
set sock->ops to NULL because we will not call try_module_get().

While searching for another bug, I also noticed that sys_accept() has
a possibility of doing a module_put() when it did not do an
__module_get so I re-ordered the call to security_socket_accept().
Signed-off-by: NFrank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a79af59e

[NET]: Prefetch dev->qdisc_lock in dev_queue_xmit() · 2d7ceece

由 Eric Dumazet 提交于 9月 27, 2005

We know the lock is going to be taken.
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d7ceece

[NET]: Use non-recursive algorithm in skb_copy_datagram_iovec() · bc8dfcb9

由 Daniel Phillips 提交于 9月 27, 2005

Use iteration instead of recursion.  Fraglists within fraglists
should never occur, so we BUG check this.
Signed-off-by: NDaniel Phillips <phillips@istop.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bc8dfcb9

[NEIGH]: Add debugging check when adding timers. · 667347f1

由 David S. Miller 提交于 9月 27, 2005

If we double-add a neighbour entry timer, which should be
impossible but has been reported, dump the current state of
the entry so that we can debug this.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

667347f1

27 9月, 2005 1 次提交

[NETFILTER]: Fix invalid module autoloading by splitting iptable_nat · 188bab3a

由 Harald Welte 提交于 9月 26, 2005

When you've enabled conntrack and NAT as a module (standard case in all
distributions), and you've also enabled the new conntrack netlink
interface, loading ip_conntrack_netlink.ko will auto-load iptable_nat.ko.
This causes a huge performance penalty, since for every packet you iterate
the nat code, even if you don't want it.

This patch splits iptable_nat.ko into the NAT core (ip_nat.ko) and the
iptables frontend (iptable_nat.ko).  Threfore, ip_conntrack_netlink.ko will
only pull ip_nat.ko, but not the frontend.  ip_nat.ko will "only" allocate
some resources, but not affect runtime performance.

This separation is also a nice step in anticipation of new packet filters
(nf-hipac, ipset, pkttables) being able to use the NAT core.
Signed-off-by: NHarald Welte <laforge@netfilter.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

188bab3a

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功