Commits · 89cee8b1cbb9dac40c92ef1968aea2b45f82fd18 · openanolis / cloud-kernel

04 Jan, 2006 5 commits

Committed by Herbert Xu on Dec 13, 2005

Another spin of Herbert Xu's "safer ip reassembly" patch
for 2.6.16.

(The original patch is here:
http://marc.theaimsgroup.com/?l=linux-netdev&m=112281936522415&w=2
and my only contribution is to have tested it.)

This patch (optionally) does additional checks before accepting IP
fragments, which can greatly reduce the possibility of reassembling
fragments which originated from different IP datagrams.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

89cee8b1

[NETFILTER] ip_tables: NUMA-aware allocation · 31836064

Committed by Eric Dumazet on Dec 13, 2005

Part of a performance problem with ip_tables is that memory allocation
is not NUMA aware, but 'only' SMP aware (i.e. each CPU normally touches
separate cache lines).

Even with small iptables rules, the cost of this misplacement can be
high on common workloads.  Instead of using one vmalloc() area
(located in the node of the iptables process), we now allocate an area
for each possible CPU, using vmalloc_node() so that memory should be
allocated in the CPU's node if possible.

Port to arp_tables and ip6_tables by Harald Welte.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31836064

[TCP] BIC: CUBIC window growth (2.0) · df3271f3

Committed by Stephen Hemminger on Dec 13, 2005

Replace existing BIC version 1.1 with new version 2.0.
The main change is to replace the window growth function
with a cubic function as described in:
http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/cubic-paper.pdf
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

df3271f3

[TCP] BIC: spelling and whitespace · 05d05450

Committed by Stephen Hemminger on Dec 13, 2005

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

05d05450

[TCP] BIC: remove low utilization code. · 018da8f4

Committed by Stephen Hemminger on Dec 13, 2005

The latest BICTCP patch at:
http://www.csc.ncsu.edu:8080/faculty/rhee/export/bitcp/index_files/Page546.htm

disables the low_utilization feature of BICTCP because it doesn't work
in some cases. This patch removes it.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

018da8f4

20 Dec, 2005 2 commits

[XFRM]: Handle DCCP in xfrm{4,6}_decode_session · 9e999993

Committed by Patrick McHardy on Dec 19, 2005

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9e999993

[NETFILTER]: Fix NAT init order · 0476f171

Committed by Patrick McHardy on Dec 19, 2005

As noticed by Phil Oester, the GRE NAT protocol helper is initialized
before the NAT core, which makes registration fail.

Change the linking order to make NAT be initialized first.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

0476f171

15 Dec, 2005 1 commit

[GRE]: Fix hardware checksum modification · 1542272a

Committed by Herbert Xu on Dec 14, 2005

The skb_postpull_rcsum introduced a bug to the checksum modification.
Although the length pulled is offset bytes, the origin of the pulling
is the GRE header, not the IP header.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

1542272a

13 Dec, 2005 1 commit

[NETFILTER]: ip_nat_tftp: Fix expectation NAT · 2f9616d4

Committed by Marcus Sundberg on Dec 12, 2005

When a TFTP client is SNATed so that the port is also changed, the
port is never changed back for the expected connection.
Signed-off-by: Marcus Sundberg <marcus@ingate.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f9616d4

07 Dec, 2005 3 commits

[TCP] Vegas: timestamp before clone · dfb4b9dc

Committed by David S. Miller on Dec 06, 2005

We have to store the congestion control timestamp on the SKB before we
clone it, not after.  Else we get no timestamping information at all.

tcp_transmit_skb() has been reworked so that we can do the timestamp
still in one spot, instead of at all the call sites.

Problem discovered, and initial fix, from Tom Young
<tyo@ee.unimelb.edu.au>.
Signed-off-by: David S. Miller <davem@davemloft.net>

dfb4b9dc

[TCP] Vegas: Remove extra call to tcp_vegas_rtt_calc · 0d7bef60

Committed by Thomas Young on Dec 06, 2005

Remove unneeded call to tcp_vegas_rtt_calc. The more accurate
microsecond value has already been registered prior to calling
tcp_vegas_cong_avoid.
Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

0d7bef60

[TCP] Vegas: stop resetting rtt every ack · 5b495613

Committed by Thomas Young on Dec 06, 2005

Move the resetting of rtt measurements to inside the once per RTT
block of code.
Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b495613

06 Dec, 2005 6 commits

[NETFILTER]: Don't use conntrack entry after dropping the reference · 2fdf1faa

Committed by Patrick McHardy on Dec 05, 2005

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

2fdf1faa

[NETFILTER]: Fix unbalanced read_unlock_bh in ctnetlink · 266c8543

Committed by Patrick McHardy on Dec 05, 2005

NFA_NEST calls NFA_PUT which jumps to nfattr_failure if the skb has no
room left. We call read_unlock_bh at nfattr_failure for the NFA_PUT inside
the locked section, so move NFA_NEST inside the locked section too.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
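
The locking imbalance described in this message can be modelled in a few lines of userspace C. This is only a sketch, not the ctnetlink code: `put_attr()` is a hypothetical stand-in for NFA_PUT, the `depth` counter stands in for read_lock_bh()/read_unlock_bh(), and the function name is made up for illustration.

```c
/* Stand-in for NFA_PUT: fails once the (hypothetical) room is used up. */
static int put_attr(int *room)
{
    if (*room <= 0)
        return -1;
    (*room)--;
    return 0;
}

/* Returns the lock depth after one dump attempt: 0 means balanced,
 * -1 means the failure path unlocked a lock that was never taken. */
static int lock_balance_after_dump(int room, int nest_inside_lock)
{
    int depth = 0;

    /* Old ordering: the nest attribute is emitted before locking. */
    if (!nest_inside_lock && put_attr(&room) < 0)
        goto failure;

    depth++;                       /* models read_lock_bh() */

    /* New ordering: the nest attribute is emitted under the lock. */
    if (nest_inside_lock && put_attr(&room) < 0)
        goto failure;
    if (put_attr(&room) < 0)       /* attribute inside the locked section */
        goto failure;

    depth--;                       /* models read_unlock_bh() */
    return depth;

failure:
    depth--;                       /* the shared failure path always unlocks */
    return depth;
}
```

With no room left, the old ordering ends at depth -1, while the new ordering stays balanced on every path.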

266c8543

[NETFILTER]: Mark ctnetlink as EXPERIMENTAL · a7957563

Committed by Patrick McHardy on Dec 05, 2005

Should have been marked EXPERIMENTAL from the beginning, as the current
bunch of fixes show.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

a7957563

[NETFILTER]: Fix CTA_PROTO_NUM attribute size in ctnetlink · 0be7fa92

Committed by Patrick McHardy on Dec 05, 2005

CTA_PROTO_NUM is a u_int8_t.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

0be7fa92

[NETFILTER]: Fix ip_conntrack_flush abuse in ctnetlink · afe5c6bb

Committed by Patrick McHardy on Dec 05, 2005

ip_conntrack_flush() used to be part of ip_conntrack_cleanup(), which needs
to drop _all_ references on module unload. A table flush via ctnetlink
just needs to clean the table and doesn't need to flush the event cache or
wait for any references attached to skbs. Move everything but pure table
flushing back to ip_conntrack_cleanup().
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

afe5c6bb

[NETFILTER]: Fix incorrect argument to ip_nat_initialized() in ctnetlink · 8d1ca699

Committed by Pablo Neira Ayuso on Dec 05, 2005

ip_nat_initialized() takes enum ip_nat_manip_type as its second argument,
not a hook number.

Noticed and initial patch by Marcus Sundberg <marcus@ingate.com>.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

8d1ca699

03 Dec, 2005 2 commits

[IPV4] Fix EPROTONOSUPPORT error in inet_create · 86c8f9d1

Committed by Herbert Xu on Dec 02, 2005

There is a coding error in inet_create that causes it to always return
ESOCKTNOSUPPORT.  It should return EPROTONOSUPPORT when there are
protocols registered for a given socket type but none of them match
the requested protocol.

This is based on a patch by Jayachandran C.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
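
The intended error selection can be illustrated with a small userspace sketch. It is not the inet_create() code: the `registered` table and `lookup_proto()` are hypothetical, and the wildcard protocol handling of the real function is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical registration table: one entry per (socket type, protocol). */
struct proto_entry {
    int type;
    int protocol;
};

static const struct proto_entry registered[] = {
    { SOCK_STREAM, IPPROTO_TCP },
    { SOCK_DGRAM,  IPPROTO_UDP },
};

/* Returns 0 on a match, -ESOCKTNOSUPPORT when nothing is registered for
 * the socket type, and -EPROTONOSUPPORT when the type is known but no
 * registered protocol matches the request. */
static int lookup_proto(int type, int protocol)
{
    int err = -ESOCKTNOSUPPORT;

    for (size_t i = 0; i < sizeof(registered) / sizeof(registered[0]); i++) {
        if (registered[i].type != type)
            continue;
        if (registered[i].protocol == protocol)
            return 0;
        /* The type exists, so a miss from here on is a protocol miss. */
        err = -EPROTONOSUPPORT;
    }
    return err;
}
```

For example, asking for SOCK_STREAM with IPPROTO_UDP yields -EPROTONOSUPPORT in this sketch, while SOCK_SEQPACKET yields -ESOCKTNOSUPPORT because the table has no entry of that type.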

86c8f9d1

[IGMP]: workaround for IGMP v1/v2 bug · 24c69275

Committed by David Stevens on Dec 02, 2005

From: David Stevens <dlstevens@us.ibm.com>

As explained at:

	http://www.cs.ucsb.edu/~krishna/igmp_dos/

With IGMP version 1 and 2 it is possible to inject a unicast
report to a client which will make it ignore multicast
reports sent later by the router.

The fix is to only accept the report if it was sent to a
multicast or broadcast address.
Signed-off-by: David S. Miller <davem@davemloft.net>

24c69275

02 Dec, 2005 3 commits

[NETLINK]: Fix processing of fib_lookup netlink messages · ea86575e

Committed by Thomas Graf on Dec 01, 2005

The receive path for fib_lookup netlink messages is lacking sanity
checks for header and payload and is thus vulnerable to malformed
netlink messages causing illegal memory references.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

ea86575e

[NETFILTER]: Fix recent match jiffies wrap mismatches · 2a43c4af

Committed by Phil Oester on Dec 01, 2005

Around jiffies wrap time (i.e. within first 5 mins after boot), recent
match rules which contain both --seconds and --hitcount arguments
experience false matches.

This is because the last_pkts array is filled with zeros on creation, and
when comparing 'now' to 0 (+ --seconds argument), time_before_eq thinks it
has found a hit.

Below patch adds a break if the packet value is zero.  This has the
unfortunate side effect of causing mismatches if a packet was received
when jiffies really was equal to zero.  The odds of that happening are
slim compared to the problems caused by not adding the break however.
Plus, the author used this same method just below, so it is "good enough".

This fixes netfilter bugs #383 and #395.
Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
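
Why zero-filled slots look like hits just before the counter wraps can be seen in a userspace model. This is a sketch under assumptions, not the recent match code: it uses a 32-bit tick counter, a hypothetical `SLOTS` array size, and made-up function names; only the wrap-aware comparison is modelled on the kernel's time_before_eq().

```c
#include <stdint.h>

#define SLOTS 4 /* hypothetical size of the per-entry timestamp array */

/* Wrap-aware "a is at or before b" on a 32-bit tick counter,
 * modelled on the kernel's time_before_eq(). */
static int tick_before_eq(uint32_t a, uint32_t b)
{
    return (uint32_t)(b - a) <= (uint32_t)INT32_MAX;
}

/* Count slots whose timestamp plus window has not yet passed 'now'.
 * With skip_empty set, stop at the first zero (never written) slot,
 * which is the effect of the break described in the message. */
static int count_hits(const uint32_t *last_pkts, uint32_t now,
                      uint32_t window, int skip_empty)
{
    int hits = 0;

    for (int i = 0; i < SLOTS; i++) {
        if (skip_empty && last_pkts[i] == 0)
            break;
        if (tick_before_eq(now, last_pkts[i] + window))
            hits++;
    }
    return hits;
}
```

Shortly before the wrap, `now` is a very large unsigned value, so `0 + window` compares as being in the near future and every empty slot counts as a hit. With the early exit, only slots that were actually written are counted.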

2a43c4af

[NETFILTER]: Ignore ACKs on half open connections in TCP conntrack · 73f30602

Committed by Jozsef Kadlecsik on Dec 01, 2005

Mounting NFS file systems after a (warm) reboot could take a long time if
firewalling and connection tracking were enabled.

The reason is that NFS clients tend to use the same ports (800 and
counting down). Now on reboot, the server would still have a TCB for an
existing TCP connection client:800 -> server:2049. The client sends a
SYN from port 800 to server:2049, which elicits an ACK from the server.
The firewall on the client drops the ACK because (from its point of
view) the connection is still in half-open state, and it expects to see
a SYNACK.

The client will eventually time out after several minutes.

The following patch corrects this, by accepting ACKs on half open
connections as well.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

73f30602

30 Nov, 2005 4 commits

[NETFILTER] ipv4: small cleanups · d127e94a

Committed by Adrian Bunk on Nov 29, 2005

This patch contains the following cleanups:
- make needlessly global code static
- ip_conntrack_core.c: ip_conntrack_flush() -> ip_conntrack_flush(void)
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

d127e94a

[IPV4]: make two functions static · 4b30b1c6

Committed by Adrian Bunk on Nov 29, 2005

This patch makes two needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b30b1c6

[NET]: Add const markers to various variables. · 9b5b5cff

Committed by Arjan van de Ven on Nov 29, 2005

the patch below marks various variables const in net/; the goal is to
move them to the .rodata section so that they can't false-share
cachelines with things that get written to, as well as potentially
helping gcc a bit with optimisations.  (these were found using a gcc
patch to warn about such variables)
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9b5b5cff

[IPV4] tcp/route: Another look at hash table sizes · 18955cfc

Committed by Mike Stroyan on Nov 29, 2005

  The tcp_ehash hash table gets too big on systems with really big memory.
It is worse on systems with pages larger than 4KB.  It wastes memory that
could be better used.  It also makes the netstat command slow because reading
/proc/net/tcp and /proc/net/tcp6 needs to go through the full hash table.

  The default value should not be larger for larger page sizes.  It seems
that the effect of page size is an unintended error dating back a long
time.  I also wonder if the default value really should be a larger
fraction of memory for systems with more memory.  While systems with
really big ram can afford more space for hash tables, it is not clear to
me that they benefit from increasing the allocation ratio for this table.

  The amount of memory allocated is determined by net/ipv4/tcp.c:tcp_init and
mm/page_alloc.c:alloc_large_system_hash.

tcp_init calls alloc_large_system_hash passing parameters-
    bucketsize=sizeof(struct tcp_ehash_bucket)
    numentries=thash_entries
    scale=(num_physpages >= 128 * 1024) ? (25-PAGE_SHIFT) : (27-PAGE_SHIFT)
    limit=0

On i386, PAGE_SHIFT is 12 for a page size of 4K
On ia64, PAGE_SHIFT defaults to 14 for a page size of 16K

The num_physpages test above makes the allocation take a larger fraction
of the total memory on systems with larger memory.  The threshold size
for a i386 system is 512MB.  For an ia64 system with 16KB pages the
threshold is 2GB.

For smaller memory systems-
On i386, scale = (27 - 12) = 15
On ia64, scale = (27 - 14) = 13
For larger memory systems-
On i386, scale = (25 - 12) = 13
On ia64, scale = (25 - 14) = 11

  For the rest of this discussion, I'll just track the larger memory case.

  The default behavior has numentries=thash_entries=0, so the allocated
size is determined by either scale or by the default limit of 1/16 of
total memory.

In alloc_large_system_hash-
|	numentries = (flags & HASH_HIGHMEM) ? nr_all_pages : nr_kernel_pages;
|	numentries += (1UL << (20 - PAGE_SHIFT)) - 1;
|	numentries >>= 20 - PAGE_SHIFT;
|	numentries <<= 20 - PAGE_SHIFT;

  At this point, numentries is pages for all of memory, rounded up to the
nearest megabyte boundary.

|	/* limit to 1 bucket per 2^scale bytes of low memory */
|	if (scale > PAGE_SHIFT)
|		numentries >>= (scale - PAGE_SHIFT);
|	else
|		numentries <<= (PAGE_SHIFT - scale);

On i386, numentries >>= (13 - 12), so numentries is 1/8192 of
bytes of total memory.
On ia64, numentries <<= (14 - 11), so numentries is 1/2048 of
bytes of total memory.

|        log2qty = long_log2(numentries);
|
|        do {
|                size = bucketsize << log2qty;

bucketsize is 16, so size is 16 times numentries, rounded
down to a power of two.

On i386, size is 1/512 of bytes of total memory.
On ia64, size is 1/128 of bytes of total memory.

For smaller systems the results are
On i386, size is 1/2048 of bytes of total memory.
On ia64, size is 1/512 of bytes of total memory.

  The large page effect can be removed by just replacing
the use of PAGE_SHIFT with a constant of 12 in the calls to
alloc_large_system_hash.  That makes them more like the other uses of
that function from fs/inode.c and fs/dcache.c
Signed-off-by: David S. Miller <davem@davemloft.net>
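
The arithmetic in this message can be checked with a short userspace model. It is a sketch only: `ehash_bytes()` and `floor_log2()` are hypothetical names, total memory is passed in as a byte count, and the 1/16-of-memory limit and the allocation retry loop of alloc_large_system_hash are not modelled.

```c
#include <stdint.h>

/* Floor of log2(v) for v > 0, standing in for long_log2(). */
static unsigned floor_log2(uint64_t v)
{
    unsigned r = 0;

    while (v >>= 1)
        r++;
    return r;
}

/* Hash table size in bytes for the given total memory, page shift,
 * scale and bucket size, following the steps quoted in the message.
 * Assumes mem_bytes > 0 and page_shift <= 20. */
static uint64_t ehash_bytes(uint64_t mem_bytes, unsigned page_shift,
                            unsigned scale, uint64_t bucketsize)
{
    uint64_t numentries = mem_bytes >> page_shift;   /* pages */

    /* Round up to the nearest megabyte boundary. */
    numentries += (1ULL << (20 - page_shift)) - 1;
    numentries >>= 20 - page_shift;
    numentries <<= 20 - page_shift;

    /* Limit to 1 bucket per 2^scale bytes of memory. */
    if (scale > page_shift)
        numentries >>= (scale - page_shift);
    else
        numentries <<= (page_shift - scale);

    /* Round down to a power of two and convert to bytes. */
    return bucketsize << floor_log2(numentries);
}
```

For 4 GiB of memory and 16-byte buckets, `ehash_bytes(4ULL << 30, 12, 25 - 12, 16)` gives 1/512 of memory (the i386 case) and `ehash_bytes(4ULL << 30, 14, 25 - 14, 16)` gives 1/128 (the ia64 case). Passing the constant-12 scale with 16KB pages, `ehash_bytes(4ULL << 30, 14, 25 - 12, 16)`, brings the ia64 result back to 1/512.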

18955cfc

24 Nov, 2005 1 commit

[NETFILTER]: ip_conntrack_netlink.c needs linux/interrupt.h · de919820

Committed by Benoit Boissinot on Nov 23, 2005

net/ipv4/netfilter/ip_conntrack_netlink.c: In function 'ctnetlink_dump_table':
net/ipv4/netfilter/ip_conntrack_netlink.c:409: warning: implicit declaration of function 'local_bh_disable'
net/ipv4/netfilter/ip_conntrack_netlink.c:427: warning: implicit declaration of function 'local_bh_enable'
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

de919820

23 Nov, 2005 2 commits

[NETFILTER] ctnetlink: Fix refcount leak ip_conntrack/nat_proto · 00cb277a

Committed by Pablo Neira Ayuso on Nov 22, 2005

Remove proto == NULL checking since ip_conntrack_[nat_]proto_find_get
always returns a valid pointer.

Fix missing ip_conntrack_proto_put in some paths.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

00cb277a

[IPV4]: Fix secondary IP addresses after promotion · 0ff60a45

Committed by Jamal Hadi Salim on Nov 22, 2005

This patch fixes the problem with promoting aliases when:
a) a single primary and > 1 secondary addresses
b) multiple primary addresses each with at least one secondary address

Based on earlier efforts from Brian Pomerantz <bapper@piratehaven.org>,
Patrick McHardy <kaber@trash.net> and Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ff60a45

21 Nov, 2005 2 commits

[NETFILTER]: fixed dependencies between modules related with ip_conntrack · 2b8f2ff6

Committed by Yasuyuki Kozakai on Nov 20, 2005

- IP_NF_CONNTRACK_MARK is bool and depends on only IP_NF_CONNTRACK
  which is tristate. If a variable depends on IP_NF_CONNTRACK_MARK and
  doesn't care about IP_NF_CONNTRACK, it can be y. This must be avoided.
- IP_NF_CT_ACCT has same problem.
- IP_NF_TARGET_CLUSTERIP also depends on IP_NF_MANGLE.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b8f2ff6

[FIB_TRIE]: Don't show local table in /proc/net/route output · c9e53cbe

Committed by Patrick McHardy on Nov 20, 2005

Don't show local table to behave similarly to fib_hash.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

c9e53cbe

18 Nov, 2005 2 commits

[NETFILTER] ip_conntrack: fix ftp/irc/tftp helpers on ports >= 32768 · 2fce76af

Committed by Harald Welte on Nov 17, 2005

Since we converted the ftp/irc/tftp helpers to use the new
module_parm_array() some time ago, we were accidentally using signed data
types - thus preventing those modules from being used on ports >= 32768.

This patch fixes it by using 'ushort' module parameters.

Thanks to Jan Nijs for reporting this bug.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
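
The 32768 boundary follows directly from the range of a signed 16-bit type. The sketch below is illustrative only: the helper names are hypothetical, and it assumes a 16-bit `short`, which is the usual case on Linux targets.

```c
#include <limits.h>

/* Nonzero if 'port' can be stored in a signed short unchanged. */
static int fits_short(long port)
{
    return port >= SHRT_MIN && port <= SHRT_MAX;
}

/* Nonzero if 'port' can be stored in an unsigned short unchanged. */
static int fits_ushort(long port)
{
    return port >= 0 && port <= USHRT_MAX;
}
```

Under that assumption, `fits_short(32767)` holds but `fits_short(32768)` does not, whereas `fits_ushort()` accepts the whole 0 to 65535 port range.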

2fce76af

[TCP]: TCP highspeed build error · bd6af700

Committed by Stephen Hemminger on Nov 17, 2005

There is a compile error that crept in with the last batch of
TCP patches.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bd6af700

17 Nov, 2005 1 commit

[IPV4,IPV6]: replace handmade list with hlist in IPv{4,6} reassembly · e7c8a41e

Committed by Yasuyuki Kozakai on Nov 16, 2005

Both of ipq and frag_queue have *next and **prev, and they can be replaced
with hlist. Thanks Arnaldo Carvalho de Melo for the suggestion.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
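
For readers unfamiliar with the pattern, the next-pointer plus pointer-to-previous-link layout that hlist packages up looks roughly like this. It is a minimal userspace sketch with hypothetical names (`hnode`, `hhead`, `hlist_demo`), not the kernel's list.h implementation.

```c
#include <stddef.h>

struct hnode {
    struct hnode *next;    /* next node in the chain, or NULL */
    struct hnode **pprev;  /* address of the pointer that points at us */
};

struct hhead {
    struct hnode *first;
};

/* Insert n at the front of the chain. */
static void hnode_add_head(struct hnode *n, struct hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink n without needing to know the head it is on. */
static void hnode_del(struct hnode *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}

/* Small self-check: returns 1 if deleting the middle node of
 * c -> b -> a leaves a consistent chain c -> a. */
static int hlist_demo(void)
{
    struct hhead head = { NULL };
    struct hnode a, b, c;

    hnode_add_head(&a, &head);
    hnode_add_head(&b, &head);
    hnode_add_head(&c, &head);   /* chain is now c -> b -> a */
    hnode_del(&b);               /* chain is now c -> a */

    return head.first == &c && c.next == &a && a.next == NULL &&
           a.pprev == &c.next;
}
```

Because each node stores the address of the link that points at it, deletion needs no special case for the first element, which is what makes a single-pointer hash bucket head sufficient.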

e7c8a41e

16 Nov, 2005 1 commit

[TCP]: More spelling fixes. · 31f34269

Committed by Stephen Hemminger on Nov 15, 2005

From Joe Perches
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

31f34269

15 Nov, 2005 3 commits

[NETFILTER] nfnetlink: unconditionally require CAP_NET_ADMIN · 37d2e7a2

Committed by Harald Welte on Nov 14, 2005

This patch unconditionally requires CAP_NET_ADMIN for all nfnetlink
messages. It also removes the per-message cap_required field, since all
existing subsystems use CAP_NET_ADMIN for all their messages anyway.

Patrick McHardy owes me a beer if we ever need to re-introduce this.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

37d2e7a2

[NETFILTER] ctnetlink: More thorough size checking of attributes · 56558208

Committed by Pablo Neira Ayuso on Nov 14, 2005

Add missing size checks. Thanks Patrick McHardy for the hint.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: NHarald Welte <laforge@netfilter.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

56558208

[NETFILTER] ctnetlink: use size_t to make gcc-4.x happy · dbd36ea4

Committed by Pablo Neira Ayuso on Nov 14, 2005

Make gcc-4.x happy. Use size_t instead of int. Thanks to Patrick McHardy
for the hint.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

dbd36ea4

13 Nov, 2005 1 commit

[NETFILTER] {ip,nf}_conntrack TCP: Accept SYN+PUSH like SYN · a2d7222f

Committed by Vlad Drukker on Nov 12, 2005

Some devices (e.g. Qlogic iSCSI HBA hardware like QLA4010 up to firmware
3.0.0.4) initiate TCP with SYN and PUSH flags set.

The Linux TCP/IP stack deals fine with that, but the connection tracking
code doesn't.

This patch alters TCP connection tracking to accept SYN+PUSH as a valid
flag combination.
Signed-off-by: Vlad Drukker <vlad@storewiz.com>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a2d7222f

openanolis / cloud-kernel synced successfully about 1 year ago