Commits · 6a438bbe68c7013a42d9c5aee5a40d7dafdbe6ec · openanolis / cloud-kernel

11 Nov, 2005 6 commits

[TCP]: speed up SACK processing · 6a438bbe

Authored by Stephen Hemminger on Nov 10, 2005

Use "hints" to speed up the SACK processing. Various forms
of this have been used by TCP developers (Web100, STCP, BIC)
to avoid the 2x linear search of outstanding segments.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

6a438bbe

[TCP]: spelling fixes · caa20d9a

Authored by Stephen Hemminger on Nov 10, 2005

Minor spelling fixes for TCP code.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

caa20d9a

[TCP]: Appropriate Byte Count support · 9772efb9

Authored by Stephen Hemminger on Nov 10, 2005

This is an updated version of the RFC3465 ABC patch originally
for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting
bytes ack'd rather than packets when updating congestion control.

The original ABC described in the RFC applied to a Reno style
algorithm. For advanced congestion control there is little
change after leaving slow start.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9772efb9
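
The byte-counting idea can be sketched in user space as follows. This is a minimal illustration of the Reno-style congestion-avoidance rule described in RFC 3465, not the code from this patch; `struct abc_state` and `abc_cong_avoid()` are hypothetical names introduced for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified connection state; not the kernel's struct tcp_sock. */
struct abc_state {
    uint32_t cwnd;        /* congestion window, in segments */
    uint32_t mss;         /* sender maximum segment size, in bytes */
    uint32_t bytes_acked; /* bytes acknowledged but not yet credited to cwnd */
};

/*
 * Congestion avoidance with byte counting: cwnd grows by one segment
 * only after a full window's worth of bytes has been acknowledged,
 * no matter how many ACK packets carried those bytes.
 */
static void abc_cong_avoid(struct abc_state *s, uint32_t acked_bytes)
{
    uint64_t window_bytes = (uint64_t)s->cwnd * s->mss;

    s->bytes_acked += acked_bytes;
    if (s->bytes_acked >= window_bytes) {
        s->bytes_acked -= (uint32_t)window_bytes;
        s->cwnd++;
    }
}
```

With `cwnd = 10` and `mss = 1000`, an ACK covering 4000 bytes leaves the window unchanged; a further ACK covering 6000 bytes completes the window and raises `cwnd` to 11.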

[TCP]: add tcp_slow_start helper · 7faffa1c

Authored by Stephen Hemminger on Nov 10, 2005

Move all the code that does linear TCP slow start into one
inline function, to ease a later patch that adds ABC support.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

7faffa1c
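
A helper of this kind might look roughly like the sketch below. It assumes slow start adds one segment per ACK and is capped by a clamp value; `struct cwnd_state` and `slow_start_step()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of a TCP socket. */
struct cwnd_state {
    uint32_t snd_cwnd;       /* current congestion window, in segments */
    uint32_t snd_cwnd_clamp; /* upper bound on the congestion window */
};

/* One slow-start step: grow the window by one segment, up to the clamp. */
static inline void slow_start_step(struct cwnd_state *tp)
{
    if (tp->snd_cwnd < tp->snd_cwnd_clamp)
        tp->snd_cwnd++;
}
```

Keeping the step in one place means each congestion control module calls the same function, so a later change (such as byte counting) only has to touch one body.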

[TCP]: fix congestion window update when using TSO deferral · f4805ede

Authored by Stephen Hemminger on Nov 10, 2005

TCP performance with TSO over networks with delay is awful.
On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and
50Mbits/sec without TSO.

The problem is that with TSO, we intentionally do not keep the maximum
number of packets in flight to fill the window; we hold out until
we can send an MSS chunk. But we also don't update the congestion window
unless we have filled it, as per RFC2861.

This patch replaces the check for the congestion window being full
with something smarter that accounts for TSO.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f4805ede
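
To put the figures in this message in perspective, the amount of data that must be in flight to keep a link busy is its bandwidth-delay product. The sketch below assumes the quoted 150 ms is the round-trip time (the message only says "delay"); `bdp_bytes()` is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Bandwidth-delay product in bytes: the amount of unacknowledged data
 * needed in flight to sustain rate_bits_per_sec over a path whose
 * round-trip time is rtt_ms milliseconds.
 */
static uint64_t bdp_bytes(uint64_t rate_bits_per_sec, uint64_t rtt_ms)
{
    return rate_bits_per_sec * rtt_ms / 1000 / 8;
}
```

Under that assumption, filling the 100 Mbit/s link needs 1,875,000 bytes in flight, while a sustained 4 Mbit/s corresponds to only 75,000 bytes in flight, which is consistent with a congestion window that never grows.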

[NET]: Detect hardware rx checksum faults correctly · fb286bb2

Authored by Herbert Xu on Nov 10, 2005

Here is the patch that introduces the generic skb_checksum_complete
which also checks for hardware RX checksum faults.  If that happens,
it'll call netdev_rx_csum_fault which currently prints out a stack
trace with the device name.  In future it can turn off RX checksum.

I've converted every spot under net/ that does RX checksum checks to
use skb_checksum_complete or __skb_checksum_complete with the
exceptions of:

* Those places where checksums are done bit by bit.  These will call
netdev_rx_csum_fault directly.

* The following have not been completely checked/converted:

ipmr
ip_vs
netfilter
dccp

This patch is based on patches and suggestions from Stephen Hemminger
and David S. Miller.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

fb286bb2

10 Nov, 2005 4 commits

[NETLINK]: Generic netlink family · 482a8524

Authored by Thomas Graf on Nov 10, 2005

The generic netlink family builds on top of netlink and provides
simplified access for the less demanding netlink users. It solves
the problem of protocol numbers running out by introducing a so
called controller taking care of id management and name resolving.

Generic netlink modules register themselves after filling out their
id card (struct genl_family). After successful registration the
modules are able to register callbacks to command numbers by
filling out a struct genl_ops and calling genl_register_op(). The
registered callbacks are invoked with attributes parsed, making
life of simple modules a lot easier.

Although generic netlink modules can request static identifiers,
it is recommended to use GENL_ID_GENERATE and to let the controller
assign a unique identifier to the module. Userspace applications
will then ask the controller and look up the identifier by the module
name.

Due to the current multicast implementation of netlink, the number
of generic netlink modules is restricted to 1024 to avoid wasting
memory for the per socket multicast subscription bitmask.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

482a8524

[NETLINK]: Generic netlink receive queue processor · 82ace47a

Authored by Thomas Graf on Nov 10, 2005

Introduces netlink_run_queue() to handle the receive queue of
a netlink socket in a generic way. Processes as much as there
was in the queue upon entry and invokes a callback function
for each netlink message found. The callback function may
refuse a message by returning a negative error code but setting
the error pointer to 0 in which case netlink_run_queue() will
return with a qlen != 0.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

82ace47a

[NETLINK]: Type-safe netlink messages/attributes interface · bfa83a9e

Authored by Thomas Graf on Nov 10, 2005

Introduces a new type-safe interface for netlink message and
attributes handling. The interface is fully binary compatible
with the old interface towards userspace. Besides type safety,
this interface features attribute validation capabilities,
simplified message construction, and documentation.

The resulting netlink code should be smaller, less error prone
and easier to understand.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

bfa83a9e

[NETFILTER]: Add nf_conntrack subsystem. · 9fb9cbb1

Authored by Yasuyuki Kozakai on Nov 09, 2005

The existing connection tracking subsystem in netfilter can only
handle ipv4.  There were basically two choices present to add
connection tracking support for ipv6.  We could either duplicate all
of the ipv4 connection tracking code into an ipv6 counterpart, or (the
choice taken by these patches) we could design a generic layer that
could handle both ipv4 and ipv6, thus requiring only one sub-protocol
(TCP, UDP, etc.) connection tracking helper module to be written.

In fact nf_conntrack is capable of working with any layer 3
protocol.

The existing ipv4 specific conntrack code could also not deal
with the peculiarities of doing connection tracking on ipv6,
which is also cured here.  For example, these issues include:

1) ICMPv6 handling, which is used for neighbour discovery in
   ipv6 thus some messages such as these should not participate
   in connection tracking since effectively they are like ARP
   messages

2) fragmentation must be handled differently in ipv6, because
   the simplistic "defrag, connection track and NAT, refrag"
   (which the existing ipv4 connection tracking does) approach simply
   isn't feasible in ipv6

3) ipv6 extension header parsing must occur at the correct spots
   before and after connection tracking decisions, and there were
   no provisions for this in the existing connection tracking
   design

4) ipv6 has no need for stateful NAT

The ipv4 specific conntrack layer is kept around, until all of
the ipv4 specific conntrack helpers are ported over to nf_conntrack
and it is feature complete.  Once that occurs, the old conntrack
stuff will get placed into the feature-removal-schedule and we will
fully kill it off 6 months later.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

9fb9cbb1

09 Nov, 2005 6 commits

ieee80211: cleanup crypto list handling, other minor cleanups. · e3305626
Authored by Christoph Hellwig on Nov 09, 2005

e3305626

[Bluetooth]: Remove the usage of /proc completely · be9d1227

Authored by Marcel Holtmann on Nov 08, 2005

This patch removes all relics of the /proc usage from the Bluetooth
subsystem core and its upper layers. All of the previous information is
now available via /sys/class/bluetooth through appropriate functions.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

be9d1227

[Bluetooth]: Add endian annotations to the core · 1ebb9252

Authored by Marcel Holtmann on Nov 08, 2005

This patch adds the endian annotations to the Bluetooth core.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ebb9252

[NET]: sk_add_backlog convert from macro to inline · 9ee6b535

Authored by Stephen Hemminger on Nov 08, 2005

There is no reason for sk_add_backlog to be a macro. It can
just be an inline function and get type checking.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9ee6b535
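
The benefit of an inline function over a macro can be illustrated with a generic tail-append on a singly linked queue. This is a user-space sketch with hypothetical types (`struct pkt`, `struct backlog`) and a hypothetical function name, not the kernel's sk_add_backlog.

```c
#include <stddef.h>

/* Hypothetical packet and queue types for the illustration. */
struct pkt {
    struct pkt *next;
};

struct backlog {
    struct pkt *head;
    struct pkt *tail;
};

/*
 * Append a packet to the tail of the queue. Because this is a function,
 * the compiler checks that callers pass a struct backlog * and a
 * struct pkt *; a macro would expand whatever expressions it was given.
 */
static inline void backlog_add(struct backlog *q, struct pkt *p)
{
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}
```

Passing the arguments in the wrong order, for example, is diagnosed at compile time as an incompatible pointer type instead of silently expanding.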

[IPV6]: Make ipv6_addr_type() more generic so that we can use it for source address selection. · b1cacb68

Authored by YOSHIFUJI Hideaki on Nov 08, 2005

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1cacb68

[IPV6]: Put addr_diff() into common header for future use. · 971f359d

Authored by YOSHIFUJI Hideaki on Nov 08, 2005

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

971f359d

08 Nov, 2005 1 commit

Update version ieee80211 stamp to 1.1.7 · d7e02edb
Authored by James Ketrenos on Oct 24, 2005

d7e02edb

06 Nov, 2005 3 commits

[TCP/DCCP]: Randomize port selection · 6df71634

Authored by Stephen Hemminger on Nov 03, 2005

This patch randomizes the port selected on bind() for connections
to help protect against possible security attacks. It should also be faster
in most cases because there is no need for a global lock.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

6df71634

[NET]: Introduce INET_ECN_set_ce() function · 2566a509

Authored by Thomas Graf on Nov 05, 2005

Changes IP_ECN_set_ce() and IP6_ECN_set_ce() to return 0 if the CE
bits could not be set because none of the ECT bits are set, or 1
if the CE bits are already set or have been successfully set.

Introduces INET_ECN_set_ce(skb) to enable CE bits for all supported
protocols.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

2566a509
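
The return-value convention described above can be sketched on a bare TOS / traffic-class byte. This is a simplified illustration with hypothetical names; it uses the standard two-bit ECN codepoints, and it leaves out the IPv4 header checksum update that real header-rewriting code would also need.

```c
#include <stdint.h>

/* ECN codepoints in the two low-order bits of the TOS / traffic class byte. */
enum {
    ECN_NOT_ECT = 0, /* sender is not ECN-capable */
    ECN_ECT_1   = 1,
    ECN_ECT_0   = 2,
    ECN_CE      = 3, /* congestion experienced */
    ECN_MASK    = 3
};

/*
 * Hypothetical helper: mark "congestion experienced" if allowed.
 * Returns 0 when no ECT bit is set (the packet must not be marked),
 * and 1 when CE was already set or has just been set.
 */
static int ecn_set_ce(uint8_t *tos)
{
    if ((*tos & ECN_MASK) == ECN_NOT_ECT)
        return 0;
    *tos |= ECN_CE;
    return 1;
}
```

A caller such as a queueing discipline can use the 0 return to fall back to dropping the packet when marking is not possible.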

[PKT_SCHED]: Generic RED layer · a7834745

Authored by Thomas Graf on Nov 05, 2005

Extracts the RED algorithm from sch_red.c and puts it into include/net/red.h
for use by other RED based modules. The statistics are extended to be more
fine grained in order to differentiate between probability/forced marks/drops.
We now reset the average queue length when setting new parameters; leaving
it unchanged might result in an unreasonable qavg for a while, depending on the value of W.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

a7834745
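
The role of W can be seen in the usual RED moving average, avg' = avg + W * (backlog - avg). The sketch below assumes W is a negative power of two, W = 2^-wlog, and keeps the average scaled by 2^wlog so that only integer operations are needed; `red_update_avg()` is a hypothetical name, not the function added by this patch.

```c
#include <stdint.h>

/*
 * One update of the exponentially weighted moving average used by RED.
 * qavg_scaled holds avg * 2^wlog; the unscaled average is
 * qavg_scaled >> wlog. Multiplying avg' = avg + W * (backlog - avg)
 * by 2^wlog gives the expression below.
 */
static uint32_t red_update_avg(uint32_t qavg_scaled, uint32_t backlog,
                               unsigned int wlog)
{
    return qavg_scaled + backlog - (qavg_scaled >> wlog);
}
```

With `wlog = 2` and a constant backlog of 8, the scaled average moves 0, 8, 14, 19, ... towards the fixed point 32 (an unscaled average of 8). The larger `wlog` is, the longer a stale starting value lingers, which is why resetting the average on a parameter change is useful.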

29 Oct, 2005 4 commits

[SCTP] Rename SCTP specific control message flags. · eaa5c54d

Authored by Ivan Skytte Jorgensen on Oct 28, 2005

Rename SCTP specific control message flags to use SCTP_ prefix rather than
MSG_ prefix as per the latest sctp sockets API draft.
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>

eaa5c54d

drivers/net: Remove pointless checks for NULL prior to calling kfree() · b4558ea9
Authored by Jesper Juhl on Oct 28, 2005

b4558ea9
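
The same cleanup pattern exists in user space, where free(NULL) is defined to do nothing, just as kfree() accepts a NULL pointer. The sketch below is a user-space analogue with a hypothetical `struct dev_priv`, not code from any driver touched by this commit.

```c
#include <stdlib.h>

/* Hypothetical per-device buffers; either pointer may be NULL. */
struct dev_priv {
    char *rx_buf;
    char *tx_buf;
};

/* Release both buffers without guarding each call with a NULL check. */
static void dev_priv_release(struct dev_priv *p)
{
    free(p->rx_buf); /* free(NULL) is a no-op, so no "if" is needed */
    free(p->tx_buf);
    p->rx_buf = NULL;
    p->tx_buf = NULL;
}
```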

[Bluetooth] Make more functions static · 6516455d

Authored by Marcel Holtmann on Oct 28, 2005

This patch makes another bunch of functions static.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

6516455d

[Bluetooth] Move CRC table into RFCOMM core · 408c1ce2

Authored by Marcel Holtmann on Oct 28, 2005

This patch moves rfcomm_crc_table[] into the RFCOMM core, because there
is no need to keep it in a separate file.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

408c1ce2

28 Oct, 2005 1 commit

[PATCH] gfp_t: net/* · 7d877f3b

Authored by Al Viro on Oct 21, 2005

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

7d877f3b

26 Oct, 2005 2 commits

[IPSEC]: Kill obsolete get_mss function · 80b30c10

Authored by Herbert Xu on Oct 15, 2005

Now that we've switched over to storing MTUs in the xfrm_dst entries,
we no longer need the dst's get_mss methods.  This patch gets rid of
them.

It also documents the fact that our MTU calculation is not optimal
for ESP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

80b30c10

[LLC]: Strip RIF flag from source MAC address · 5ed688a7

Authored by Jochen Friedrich on Oct 23, 2005

Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

5ed688a7

23 Oct, 2005 1 commit

[AX.25]: Fix signed char bug · 4595f251

Authored by Ralf Baechle on Oct 14, 2005

On architectures where the char type defaults to unsigned, some of the
arithmetic in the AX.25 stack fails, resulting in some packets being dropped
on receive.

Credits for tracking this down and the original patch to
Bob Brose N0QBJ <linuxhams@n0qbj-11.ampr.org>.
Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>

4595f251
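
The class of bug is easy to reproduce: the same byte yields different integers depending on whether it is read through a signed or an unsigned char, and plain `char` behaves like one or the other depending on the architecture and compiler. The two hypothetical helpers below make the difference explicit; they are an illustration, not the AX.25 code that was fixed, and the signed result assumes the usual two's-complement conversion.

```c
/* Interpret a raw byte the way a platform with signed char would. */
static int as_signed(unsigned char raw)
{
    return (signed char)raw;
}

/* Interpret a raw byte the way a platform with unsigned char would. */
static int as_unsigned(unsigned char raw)
{
    return (unsigned char)raw;
}
```

For the byte 0xF0 the first helper gives a negative value and the second gives 240, so any comparison or subtraction written with plain `char` can change its outcome between architectures. Declaring the variable explicitly as `signed char` or `unsigned char` removes the ambiguity.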

22 Oct, 2005 1 commit

Update version ieee80211 stamp to 1.1.6 · 519a62bb
Authored by James Ketrenos on Oct 20, 2005

519a62bb

11 Oct, 2005 1 commit

[TWSK]: Grab the module refcount for timewait sockets · eeb2b856

Authored by Arnaldo Carvalho de Melo on Oct 10, 2005

This is required to avoid unloading a module that has active timewait
sockets, such as DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

eeb2b856

09 Oct, 2005 1 commit

[PATCH] gfp flags annotations - part 1 · dd0fc66f

Authored by Al Viro on Oct 07, 2005

 - added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

dd0fc66f
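
The effect of the typedef can be sketched outside the kernel as follows. The `__nocast` definition mirrors how the annotation is visible only to sparse (which defines `__CHECKER__`) and expands to nothing for the compiler; `DEMO_GFP_WAIT` and `demo_may_sleep()` are hypothetical and exist only for this example.

```c
/* The annotation is seen only by sparse; for gcc it expands to nothing. */
#ifdef __CHECKER__
#define __nocast __attribute__((nocast))
#else
#define __nocast
#endif

typedef unsigned int __nocast gfp_t;

/* Hypothetical flag value, used only in this sketch. */
#define DEMO_GFP_WAIT ((gfp_t)0x10u)

/* The prototype now documents that the argument is a set of gfp flags. */
static int demo_may_sleep(gfp_t flags)
{
    return (flags & DEMO_GFP_WAIT) != 0;
}
```

Since the compiler sees a plain `unsigned int`, the generated code is unchanged, which matches the claim in the message above.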

07 Oct, 2005 2 commits

[SCTP] Fix SCTP socket options to work with 32-bit apps on 64-bit kernels. · 20c9c825

Authored by Sridhar Samudrala on Oct 06, 2005

Adds alignment attribute to a few structures used with SCTP socket
options so that the sizes and offsets remain the same when built using
either 32 or 64 bit tools.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20c9c825
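
Why an alignment attribute helps can be shown with a structure that mixes a 32-bit and a 64-bit member. The sketch below uses hypothetical structures and assumes a GCC-compatible compiler; the exact attribute used by the patch is not stated in the message, so `packed, aligned(4)` here is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Natural layout: the 64-bit member is typically 4-byte aligned on
 * 32-bit x86 (size 12) but 8-byte aligned on 64-bit targets (size 16).
 */
struct opt_natural {
    int32_t  assoc_id;
    uint64_t value;
};

/* Fixed layout: the same size and offsets regardless of the ABI. */
struct opt_fixed {
    int32_t  assoc_id;
    uint64_t value;
} __attribute__((packed, aligned(4)));
```

With the attribute, `sizeof(struct opt_fixed)` is 12 and `value` sits at offset 4 on both kinds of build, so a 32-bit application and a 64-bit kernel agree on the layout.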

[SCTP] Fix sctp_get{pl}addrs() API to work with 32-bit apps on 64-bit kernels. · 5fe467ee

Authored by Ivan Skytte Jørgensen on Oct 06, 2005

The old socket options are marked with a _OLD suffix so that the
existing 32-bit apps on 32-bit kernels do not break.
Signed-off-by: Ivan Skytte Jørgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5fe467ee

06 Oct, 2005 1 commit

[IPSEC]: Document that policy direction is derived from the index. · 77d8d7a6

Authored by Herbert Xu on Oct 05, 2005

Here is a patch that adds a helper called xfrm_policy_id2dir to
document the fact that the policy direction can be and is derived
from the index.

This is based on a patch by YOSHIFUJI Hideaki and 210313105@suda.edu.cn.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

77d8d7a6
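
Deriving a direction from an index usually means the direction is stored in a few low-order bits of the index. The sketch below assumes three such bits; that width and the `demo_` names are assumptions for illustration, not details taken from the patch.

```c
#include <stdint.h>

/* Assumed layout: the low DEMO_DIR_BITS bits of an index hold the direction. */
enum {
    DEMO_DIR_BITS = 3,
    DEMO_DIR_MASK = (1 << DEMO_DIR_BITS) - 1
};

/* Build an index from a sequence number and a direction. */
static inline uint32_t demo_policy_make_index(uint32_t seq, int dir)
{
    return (seq << DEMO_DIR_BITS) | (uint32_t)dir;
}

/* Recover the direction from an index, as the helper's name suggests. */
static inline int demo_policy_id2dir(uint32_t index)
{
    return (int)(index & DEMO_DIR_MASK);
}
```

Wrapping the mask in a named helper documents the encoding at every call site instead of repeating a bare `index & mask` expression.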

05 Oct, 2005 4 commits

[XFRM]: fix sparse gfp nocast warnings · 83fa3400

Authored by Randy Dunlap on Oct 04, 2005

Fix implicit nocast warnings in xfrm code:
net/xfrm/xfrm_policy.c:232:47: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

83fa3400

[IPVS]: fix sparse gfp nocast warnings · 8eea00a4

Authored by Randy Dunlap on Oct 04, 2005

From: Randy Dunlap <rdunlap@xenotime.net>

Fix implicit nocast warnings in ip_vs code:
net/ipv4/ipvs/ip_vs_app.c:631:54: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

8eea00a4

[DECNET]: fix sparse gfp nocast warnings · f4a19a56

Authored by Randy Dunlap on Oct 04, 2005

Fix implicit nocast warnings in decnet code:
net/decnet/af_decnet.c:458:40: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:125:35: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:219:29: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

f4a19a56

[INET]: Shrink struct inet_ehash_bucket on 32 bits UP · 6d255361

Authored by Eric Dumazet on Oct 04, 2005

No need to align struct inet_ehash_bucket on an 8-byte boundary.

On 32-bit uniprocessor, that's a waste of 4 bytes per struct (50%).

On other platforms, the attribute is useless; natural alignment is already 8.

platform     | Size before | Size after patch
-------------+-------------+------------------
32 bits, UP  |         8   |     4
32 bits, SMP |         8   |     8
64 bits, UP  |         8   |     8
64 bits, SMP |        16   |    16
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6d255361
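
The first row of the table can be reproduced on any machine by letting a 32-bit integer stand in for the single 32-bit pointer that such a bucket holds on a uniprocessor build. The structures below are hypothetical models, not the kernel's struct inet_ehash_bucket, and the attribute syntax assumes a GCC-compatible compiler.

```c
#include <stdint.h>

/* Model of the bucket before the patch: 4 bytes of payload, forced to 8. */
struct bucket_aligned8 {
    uint32_t chain; /* stands in for a 32-bit list-head pointer */
} __attribute__((aligned(8)));

/* Model of the bucket after the patch: natural alignment only. */
struct bucket_natural {
    uint32_t chain;
};
```

The forced alignment pads each element to 8 bytes, so half of every bucket in the hash table array is padding; without the attribute each element is 4 bytes.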

04 Oct, 2005 1 commit

[INET]: speedup inet (tcp/dccp) lookups · 81c3d547

Authored by Eric Dumazet on Oct 03, 2005

Arnaldo and I agreed it could be applied now, because I have other
pending patches depending on this one (Thank you Arnaldo)

(The other important patch moves skc_refcnt into a separate cache line,
so that the SMP/NUMA performance doesn't suffer from cache line ping pongs)

1) First some performance data :
--------------------------------

tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()

The most time critical code is :

sk_for_each(sk, node, &head->chain) {
     if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
         goto hit; /* You sunk my battleship! */
}

The sk_for_each() does use prefetch() hints but only the beginning of
"struct sock" is prefetched.

As the first comparison in INET_MATCH uses inet_sk(__sk)->daddr, which is far
away from the beginning of "struct sock", it has to bring a cold cache
line into the CPU cache. Each iteration has to use at least 2 cache
lines.

This can be problematic if some chains are very long.

2) The goal
-----------

The idea I had is to change things so that INET_MATCH() may return
FALSE in 99% of cases only using the data already in the CPU cache,
using one cache line per iteration.

3) Description of the patch
---------------------------

Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
filling a 32 bits hole on 64 bits platform.

struct sock_common {
	unsigned short		skc_family;
	volatile unsigned char	skc_state;
	unsigned char		skc_reuse;
	int			skc_bound_dev_if;
	struct hlist_node	skc_node;
	struct hlist_node	skc_bind_node;
	atomic_t		skc_refcnt;
+	unsigned int		skc_hash;
	struct proto		*skc_prot;
};

Store in this 32 bits field the full hash, not masked by (ehash_size -
1). Using this full hash as the first comparison done in INET_MATCH
lets us immediately skip the element without touching a second cache
line in case of a miss.

Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
sk_hash and tw_hash) already contains the slot number if we mask with
(ehash_size - 1)

File include/net/inet_hashtables.h

64 bits platforms :
#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)->sk_hash == (__hash))                             &&  \
     ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie))   &&  \
     ((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports))   &&  \
     (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))

32bits platforms:
#define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)->sk_hash == (__hash))                 &&  \
     (inet_sk(__sk)->daddr          == (__saddr))   &&  \
     (inet_sk(__sk)->rcv_saddr      == (__daddr))   &&  \
     (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))


- Adds a prefetch(head->chain.first) in 
__inet_lookup_established()/__tcp_v4_check_established() and 
__inet6_lookup_established()/__tcp_v6_check_established() and 
__dccp_v4_check_established() to bring into cache the first element of the 
list, before the {read|write}_lock(&head->lock);
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

81c3d547
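
The hash-first comparison can be modelled in user space with a simple chained lookup. This is a sketch under simplifying assumptions (a singly linked chain, one address family, hypothetical `demo_` names and field layout); it is not the kernel's INET_MATCH macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical socket entry: the full hash sits next to the list pointer. */
struct demo_sock {
    struct demo_sock *next;
    uint32_t hash;        /* full, unmasked hash of the 4-tuple */
    uint32_t remote_addr; /* in a real struct sock these fields live */
    uint32_t local_addr;  /* much further away from the list node    */
    uint16_t remote_port;
    uint16_t local_port;
};

/* Walk one hash chain, rejecting on the stored hash before anything else. */
static struct demo_sock *demo_lookup(struct demo_sock *head, uint32_t hash,
                                     uint32_t remote_addr, uint32_t local_addr,
                                     uint16_t remote_port, uint16_t local_port)
{
    for (struct demo_sock *sk = head; sk != NULL; sk = sk->next) {
        if (sk->hash != hash)
            continue; /* cheap reject: only the first fields were read */
        if (sk->remote_addr == remote_addr && sk->local_addr == local_addr &&
            sk->remote_port == remote_port && sk->local_port == local_port)
            return sk;
    }
    return NULL;
}
```

Because entries that share a bucket rarely share the full 32-bit hash, most non-matching entries are skipped after reading only the fields near the list pointer. The full comparison is still needed after a hash match, since two different 4-tuples can produce the same hash.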

03 Oct, 2005 1 commit

Lindent and trailing whitespace script executed ieee80211 subsystem · ff0037b2
Authored by James Ketrenos on Oct 03, 2005

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>

ff0037b2

openanolis / cloud-kernel synced successfully about 1 year ago
