Commits · 37c3185a02d4b85fbe134bf5204535405dd2c957 · openanolis / cloud-kernel

23 Jun, 2006 8 commits

Authored by Herbert Xu on Jun 22, 2006

This patch adds a generic segmentation offload toggle that can be turned
on/off for each net device. For now it only supports TCPv4.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

37c3185a

[NET]: Add software TSOv4 · f4c50d99

Authored by Herbert Xu on Jun 22, 2006

This patch adds the GSO implementation for IPv4 TCP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

f4c50d99

[NET]: Add generic segmentation offload · f6a78bfc

Authored by Herbert Xu on Jun 22, 2006

This patch adds the infrastructure for generic segmentation offload.
The idea is to tap into the potential savings of TSO without hardware
support by postponing the allocation of segmented skb's until just
before the entry point into the NIC driver.

The same structure can be used to support software IPv6 TSO, as well as
UFO and segmentation offload for other relevant protocols, e.g., DCCP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

f6a78bfc

[NET]: Merge TSO/UFO fields in sk_buff · 7967168c

Authored by Herbert Xu on Jun 22, 2006

Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP).  So
let's merge them.

They were used to tell the protocol of a packet.  This function has been
subsumed by the new gso_type field.  This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb.  As such it's easy to tell whether a given device can process a GSO
skb: you just have to AND the gso_type field with the netdev's features
field.

I've made gso_type a conjunction.  The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
to be emulated in software.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
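
The device check described in this message can be sketched in user-space C as follows. This is a minimal illustration, not the kernel code: the constant values, the shift macro GSO_SHIFT, and the helper name dev_can_segment are assumptions made for the example, and SKB_GSO_TCPV4_ECN / NETIF_F_TSO_ECN are the hypothetical ECN type from the message itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in the kernel headers. */
#define GSO_SHIFT          16
#define SKB_GSO_TCPV4      (1u << 0)
#define SKB_GSO_TCPV4_ECN  (1u << 1)  /* hypothetical modifier bit */
#define NETIF_F_TSO        (SKB_GSO_TCPV4 << GSO_SHIFT)
#define NETIF_F_TSO_ECN    (SKB_GSO_TCPV4_ECN << GSO_SHIFT)

/*
 * True when the device advertises every feature bit that the packet's
 * gso_type requires. A partial match is not enough: a device that only
 * has NETIF_F_TSO cannot take a TCPV4 | TCPV4_ECN packet.
 */
static bool dev_can_segment(uint32_t dev_features, uint32_t gso_type)
{
    uint32_t required = gso_type << GSO_SHIFT;

    return (dev_features & required) == required;
}
```

With these assumed values, a device declaring only NETIF_F_TSO accepts plain SKB_GSO_TCPV4 packets, while packets that also carry SKB_GSO_TCPV4_ECN would fall back to software segmentation.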

7967168c

[NET]: Prevent transmission after dev_deactivate · d4828d85

Authored by Herbert Xu on Jun 22, 2006

The dev_deactivate function has bit-rotted since the introduction of
lockless drivers. In particular, the spin_unlock_wait call at the end
has no effect on the xmit routine of lockless drivers.

With a little bit of work, we can make it much more useful by providing
the guarantee that when it returns, no more calls to the xmit routine
of the underlying driver will be made.

The idea is simple. There are two entry points into the xmit routine.
The first comes from dev_queue_xmit. That one is easily stopped by
using synchronize_rcu. This works because we set the qdisc to noop_qdisc
before the synchronize_rcu call. That in turn causes all subsequent
packets sent to dev_queue_xmit to be dropped. The synchronize_rcu call
also ensures all outstanding calls leave their critical section.

The other entry point is from qdisc_run. Since we now have a bit that
indicates whether it's running, all we have to do is to wait until the
bit is off.

I've removed the loop to wait for __LINK_STATE_SCHED to clear. This is
useless because netif_wake_queue can cause it to be set again. It is
also harmless because we've disarmed qdisc_run.

I've also removed the spin_unlock_wait on xmit_lock because its only
purpose, making sure that all outstanding xmit_lock holders have
exited, is also served by dev_watchdog_down.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4828d85

[IPV6] ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY · 5e2707fa

Authored by YOSHIFUJI Hideaki on Jun 22, 2006

We need to update hiscore.rule even if we don't enable CONFIG_IPV6_PRIVACY,
because a less significant rule still follows: longest match.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e2707fa

[IPV6]: Fix source address selection. · 102128e3

Authored by Łukasz Stelmach on Jun 22, 2006

Two additional labels (RFC 3484, sec. 10.3) for IPv6 addresses
are defined to make a distinction between global unicast
addresses and Unique Local Addresses (fc00::/7, RFC 4193) and
Teredo (2001::/32, RFC 4380). This is necessary to avoid connection
attempts that would either fail (e.g., fec0:: to 2001:feed::)
or be sub-optimal (2001:0:: to 2001:feed::).
Signed-off-by: Łukasz Stelmach <stlman@poczta.fm>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

102128e3

[NET]: Avoid allocating skb in skb_pad · 5b057c6b

Authored by Herbert Xu on Jun 23, 2006

First of all it is unnecessary to allocate a new skb in skb_pad since
the existing one is not shared.  More importantly, our hard_start_xmit
interface does not allow a new skb to be allocated since that breaks
requeueing.

This patch uses pskb_expand_head to expand the existing skb and linearize
it if needed.  Actually, someone should sift through every instance of
skb_pad on a non-linear skb as they do not fit the reasons why this was
originally created.

Incidentally, this fixes a minor bug when the skb is cloned (tcpdump,
TCP, etc.).  As it is skb_pad will simply write over a cloned skb.  Because
of the position of the write it is unlikely to cause problems but still
it's best if we don't do it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b057c6b

20 Jun, 2006 4 commits

[ATM]: fix broken uses of NIPQUAD in net/atm · ff7512e1

Authored by Al Viro on Jun 20, 2006

NIPQUAD expects an l-value of type __be32, _NOT_ a pointer to __be32.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
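
A user-space sketch of why the argument must be an l-value: a macro of this shape takes the address of its argument itself, so handing it a pointer would format the bytes of the pointer variable rather than the address it points to. The macro below is a stand-in written for this sketch, and format_ip is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel macro: note the '&' applied to the argument. */
#define NIPQUAD(addr) \
    ((unsigned char *)&(addr))[0], \
    ((unsigned char *)&(addr))[1], \
    ((unsigned char *)&(addr))[2], \
    ((unsigned char *)&(addr))[3]

/* Correct use: pass the address value itself (an l-value), not a pointer. */
static void format_ip(char *buf, size_t len, uint32_t be_addr)
{
    snprintf(buf, len, "%d.%d.%d.%d", NIPQUAD(be_addr));
}
```

If the caller only has a pointer `p` to the address, the l-value to pass is `*p`, not `p`.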

ff7512e1

[SCTP]: sctp_unpack_cookie() fix · 8ca84481

Authored by Al Viro on Jun 20, 2006

sizeof(pointer) != sizeof(array)...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
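
The pitfall named in this message can be shown with a small, generic C sketch. The struct and function names here are hypothetical and are not taken from sctp_unpack_cookie(); the 32-byte size is an arbitrary choice for illustration.

```c
#include <stddef.h>

/* Hypothetical structure holding a fixed-size signature. */
struct cookie {
    unsigned char signature[32];
};

/* sizeof applied to the array member yields the array size (32). */
static size_t size_via_array(const struct cookie *c)
{
    return sizeof(c->signature);
}

/* Once the array has decayed to a pointer, sizeof yields the pointer size. */
static size_t size_via_pointer(const struct cookie *c)
{
    const unsigned char *p = c->signature;

    return sizeof(p);
}
```

A length taken from the second form silently compares or copies only the first few bytes on typical platforms.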

8ca84481

[NET]: Prevent multiple qdisc runs · 48d83325

Authored by Herbert Xu on Jun 19, 2006

Having two or more qdisc_run's contend against each other is bad because
it can induce packet reordering if the packets have to be requeued.  It
appears that this is an unintended consequence of relinquishing the queue
lock while transmitting.  That in turn is needed for devices that spend a
lot of time in their transmit routine.

There are no advantages to be had as devices with queues are inherently
single-threaded (the loopback device is not but then it doesn't have a
queue).

Even if you were to add a queue to a parallel virtual device (e.g., bolt
a tbf filter in front of an ipip tunnel device), you would still want to
process the queue in sequence to ensure that the packets are ordered
correctly.

The solution here is to steal a bit from net_device to prevent this.

BTW, as qdisc_restart is no longer used by anyone as a module inside the
kernel (IIRC it used to with netif_wake_queue), I have not exported the
new __qdisc_run function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
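
The "steal a bit" idea can be sketched in user-space C with a C11 atomic flag. This is only an illustration under assumptions: struct fake_dev and the helpers run_begin/run_end are hypothetical names for this sketch, and the kernel uses its own bit operations on net_device state rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device state. */
struct fake_dev {
    atomic_flag qdisc_running;  /* the "stolen" bit */
};

static void fake_dev_init(struct fake_dev *dev)
{
    atomic_flag_clear(&dev->qdisc_running);
}

/*
 * Try to become the single runner of the queue. Returns false if another
 * context is already draining it, in which case the caller simply leaves
 * its packet on the queue for the current runner to pick up.
 */
static bool run_begin(struct fake_dev *dev)
{
    return !atomic_flag_test_and_set(&dev->qdisc_running);
}

/* Called by the runner once it has finished draining the queue. */
static void run_end(struct fake_dev *dev)
{
    atomic_flag_clear(&dev->qdisc_running);
}
```

Because only one context can hold the flag at a time, requeued packets cannot be overtaken by a second runner, which is the reordering hazard the message describes.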

48d83325

[NETFILTER]: xt_sctp: fix endless loop caused by 0 chunk length · d3dcd4ef

Authored by Patrick McHardy on Jun 19, 2006

Fix endless loop in the SCTP match similar to those already fixed in
the SCTP conntrack helper (was CVE-2006-1527).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
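
A minimal sketch of how a chunk walk can loop forever on a zero length, and one way to guard against it. This is not the xt_sctp code: count_chunks is a hypothetical helper, and the sketch assumes the usual SCTP chunk header layout (1-byte type, 1-byte flags, 2-byte big-endian length that includes the 4-byte header, chunks padded to 4 bytes).

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_HDR_LEN 4

/*
 * Walk the chunks in buf and return how many were seen, or -1 if a chunk
 * header is malformed. Without the length check, a chunk length of 0 would
 * leave 'off' unchanged and the loop would never terminate.
 */
static int count_chunks(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int n = 0;

    while (off + CHUNK_HDR_LEN <= len) {
        size_t chunk_len = ((size_t)buf[off + 2] << 8) | buf[off + 3];

        /* Reject lengths that cannot make progress or overrun the buffer. */
        if (chunk_len < CHUNK_HDR_LEN || chunk_len > len - off)
            return -1;

        n++;
        off += (chunk_len + 3) & ~(size_t)3;  /* advance past padding */
    }
    return n;
}
```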

d3dcd4ef

18 Jun, 2006 28 commits

[ETHTOOL]: Fix UFO typo · 47552c4e

Authored by Herbert Xu on Jun 17, 2006

The function ethtool_get_ufo was referring to ETHTOOL_GTSO instead of
ETHTOOL_GUFO.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

47552c4e

[SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer. · d5b9f4c0

Authored by Neil Horman on Jun 17, 2006

In the event that our entire receive buffer is full with a series of
chunks that represent a single gap-ack, and then we accept a chunk
(or chunks) that fill in the gap between the ctsn and the first gap,
we renege chunks from the end of the buffer, which effectively does
nothing but move our gap to the end of our received tsn stream. This
does little but move our missing tsns down stream a little, and, if the
sender is sending sufficiently large retransmit frames, the result is a
perpetual slowdown which can never be recovered from, since the only
chunk that can be accepted to allow progress in the tsn stream necessitates
that a new gap be created to make room for it. This leads to a constant
need for retransmits, and subsequent receiver stalls. The fix I've come up
with is to deliver the frame without reneging if we have a full receive
buffer and the receiving socket's sk_receive_queue is empty (indicating that
the receive buffer is being blocked by a missing tsn).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d5b9f4c0

[SCTP]: Send only 1 window update SACK per message. · d7c2c9e3

Authored by Tsutomu Fujii on Jun 17, 2006

Right now, every time we increase our rwnd by more than MTU bytes, we
trigger a SACK.  When processing large messages, this will generate a
SACK for almost every other SCTP fragment. However, since we are freeing
the entire message at the same time, we might as well collapse the SACK
generation to 1.
Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d7c2c9e3

[SCTP]: Don't do CRC32C checksum over loopback. · 503b55fd

Authored by Sridhar Samudrala on Jun 17, 2006

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

503b55fd

[SCTP] Reset rtt_in_progress for the chunk when processing its sack. · 4c9f5d53

Authored by Vlad Yasevich on Jun 17, 2006

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c9f5d53

[SCTP]: Reject sctp packets with broadcast addresses. · 5636bef7

Authored by Vlad Yasevich on Jun 17, 2006

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5636bef7

[SCTP]: Limit association max_retrans setting in setsockopt. · 402d68c4

Authored by Vlad Yasevich on Jun 17, 2006

When using ASSOCINFO socket option, we need to limit the number of
maximum association retransmissions to be no greater than the sum
of all the path retransmissions. This is specified in Section 7.1.2
of the SCTP socket API draft.
However, we only do this if the association has multiple paths. If
there is only one path, the protocol stack will use the
assoc_max_retrans setting when trying to retransmit packets.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
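
The rule in this message can be sketched as a small validation helper. This is a reading of the description only, not the kernel's setsockopt handler: set_assoc_max_retrans and its parameters are hypothetical names, and the -1 return stands in for whatever error code the real code uses.

```c
#include <stddef.h>

/*
 * Apply a requested association-wide retransmission limit.
 * With more than one path, the request is rejected if it exceeds the sum
 * of the per-path limits. With a single path, it is accepted as is.
 * Returns 0 on success, -1 if the request is rejected.
 */
static int set_assoc_max_retrans(unsigned int *assoc_max,
                                 unsigned int requested,
                                 const unsigned int *path_max,
                                 size_t npaths)
{
    if (npaths > 1) {
        unsigned int sum = 0;
        size_t i;

        for (i = 0; i < npaths; i++)
            sum += path_max[i];

        if (requested > sum)
            return -1;
    }

    *assoc_max = requested;
    return 0;
}
```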

402d68c4

[IPV6]: Sum real space for RTAs. · c5396a31

Authored by YOSHIFUJI Hideaki on Jun 17, 2006

This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications.  Issue
pointed out by Patrick McHardy <kaber@trash.net>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

c5396a31

[IRDA]: Use put_unaligned() in irlmp_do_discovery(). · b293acfd

Authored by David S. Miller on Jun 17, 2006

irda_device_info->hints[] is byte aligned but is being
accessed as a u16.

Based upon a patch by Luke Yang <luke.adi@gmail.com>.
Signed-off-by: David S. Miller <davem@davemloft.net>
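
A portable user-space sketch of the unaligned-store idea: copy the 16-bit value byte-wise instead of dereferencing a possibly misaligned u16 pointer. The helper names below are hypothetical and are not the kernel's put_unaligned() macro, which is implemented per architecture.

```c
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value at an address that may not be 2-byte aligned. */
static void store_u16_unaligned(uint16_t val, void *dst)
{
    memcpy(dst, &val, sizeof(val));
}

/* Load a 16-bit value from an address that may not be 2-byte aligned. */
static uint16_t load_u16_unaligned(const void *src)
{
    uint16_t val;

    memcpy(&val, src, sizeof(val));
    return val;
}
```

On architectures that trap on misaligned accesses, a direct `*(uint16_t *)ptr = val` at an odd address can fault, whereas the byte-wise copy is always safe.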

b293acfd

[BRIDGE]: Add support for NETIF_F_HW_CSUM devices · 2c6cc0d8

Authored by Herbert Xu on Jun 17, 2006

As it is, the bridge will only ever declare NETIF_F_IP_CSUM even if all
its constituent devices support NETIF_F_HW_CSUM.  This patch fixes
this by supporting the first one out of NETIF_F_NO_CSUM,
NETIF_F_HW_CSUM, and NETIF_F_IP_CSUM that is supported by all
constituent devices.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
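
A literal reading of that selection rule, sketched in user-space C. This is not the bridge code: bridge_csum_feature is a hypothetical helper, the flag values are illustrative, and the sketch treats each flag independently (a port counts as supporting a flag only if it sets that exact bit).

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values for this sketch. */
#define NETIF_F_IP_CSUM 2u
#define NETIF_F_NO_CSUM 4u
#define NETIF_F_HW_CSUM 8u

/*
 * Return the first flag, in preference order NO_CSUM, HW_CSUM, IP_CSUM,
 * that every port advertises, or 0 if there is no common flag.
 */
static uint32_t bridge_csum_feature(const uint32_t *port_features, size_t nports)
{
    static const uint32_t preference[] = {
        NETIF_F_NO_CSUM, NETIF_F_HW_CSUM, NETIF_F_IP_CSUM
    };
    size_t i, p;

    for (i = 0; i < sizeof(preference) / sizeof(preference[0]); i++) {
        for (p = 0; p < nports; p++) {
            if (!(port_features[p] & preference[i]))
                break;
        }
        if (p == nports)
            return preference[i];  /* all ports have this flag */
    }
    return 0;
}
```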

2c6cc0d8

[NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM · 8648b305

Authored by Herbert Xu on Jun 17, 2006

The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places. For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We
also test the disjunct of NETIF_F_IP_CSUM and the other two in various
places; for that purpose I've added NETIF_F_ALL_CSUM.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
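
The two combined masks can be sketched as follows. The individual flag values are illustrative, and the helper functions are hypothetical names added only to show how such masks shorten feature tests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values for this sketch. */
#define NETIF_F_IP_CSUM 2u
#define NETIF_F_NO_CSUM 4u
#define NETIF_F_HW_CSUM 8u

/* The two flags the message says are treated identically. */
#define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
/* Any form of checksum offload at all. */
#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)

/* Hypothetical helper: protocol-independent checksum handling available? */
static bool has_generic_csum(uint32_t features)
{
    return (features & NETIF_F_GEN_CSUM) != 0;
}

/* Hypothetical helper: any checksum handling available? */
static bool has_any_csum(uint32_t features)
{
    return (features & NETIF_F_ALL_CSUM) != 0;
}
```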

8648b305

[TCP]: Add tcp_slow_start_after_idle sysctl. · 35089bb2

Authored by David S. Miller on Jun 13, 2006

A lot of people have asked for a way to disable tcp_cwnd_restart(),
and it seems reasonable to add a sysctl to do that.
Signed-off-by: David S. Miller <davem@davemloft.net>

35089bb2

[TCP] Westwood: reset RTT min after FRTO · bc726a71

Authored by Luca De Cicco on Jun 11, 2006

RTT_min is updated each time a timeout event occurs
in order to cope with hard handovers in wireless scenarios such as UMTS.
Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

bc726a71

[TCP] Westwood: bandwidth filter startup · b3a92eab

Authored by Luca De Cicco on Jun 11, 2006

The bandwidth estimate filter is now initialized with the first
sample in order to have better performance in the case of small
file transfers.
Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
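
A simplified sketch of seeding a low-pass filter with its first sample. The struct, the helper names, and the single-stage layout are assumptions for illustration; the 7/8 weighting is likewise assumed here rather than taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical estimator state. */
struct bw_estimate {
    uint32_t est;    /* current filtered bandwidth estimate */
    bool     primed; /* false until the first sample has been seen */
};

/* Low-pass filter: keep 7/8 of the old value, take 1/8 of the new one. */
static uint32_t low_pass(uint32_t old, uint32_t sample)
{
    return (7 * old + sample) >> 3;
}

/*
 * Feed one bandwidth sample. The first sample seeds the estimate directly
 * instead of being averaged against an initial value of zero, so a short
 * transfer does not start from a badly underestimated bandwidth.
 */
static void bw_update(struct bw_estimate *e, uint32_t sample)
{
    if (!e->primed) {
        e->est = sample;
        e->primed = true;
    } else {
        e->est = low_pass(e->est, sample);
    }
}
```

Starting from zero, the same filter would report only one eighth of the first sample, which matters most when a transfer ends after a handful of samples.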

b3a92eab

[TCP] Westwood: comment fixes · b7d7a9e3

Authored by Luca De Cicco on Jun 11, 2006

Clean up some comments and add more references.
Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

b7d7a9e3

[TCP] Westwood: fix first sample · f61e2901

Authored by Stephen Hemminger on Jun 11, 2006

Need to update send sequence number tracking after first ack.
Rework of patch from Luca De Cicco.
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

f61e2901

[NET]: net.ipv4.ip_autoconfig sysctl removal · bdeb04c6

Authored by Stephen Hemminger on Jun 11, 2006

The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bdeb04c6

[IPX]: Endian bug in ipxrtr_route_packet() · f8d59621

Authored by Alexey Dobriyan on Jun 10, 2006

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8d59621

[NET]: Warn in __skb_trim if skb is paged · 3cc0e873

Authored by Herbert Xu on Jun 09, 2006

It's better to warn and fail rather than rarely triggering BUG on paths
that incorrectly call skb_trim/__skb_trim on a non-linear skb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

3cc0e873

[NET]: skb_trim audit · b38dfee3

Authored by Herbert Xu on Jun 09, 2006

I found a few more spots where pskb_trim_rcsum could be used but were not.
This patch changes them to use it.

Also, sk_filter can get paged skb data.  Therefore we must use pskb_trim
instead of skb_trim.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

b38dfee3

[NET]: Clean up skb_linearize · 364c6bad

Authored by Herbert Xu on Jun 09, 2006

The linearisation operation doesn't need to be super-optimised.  So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.

Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.

Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.

Last but not least, I've removed the gfp argument since nobody uses it
anymore.  If it's ever needed we can easily add it back.

Misc bugs fixed by this patch:

* via-velocity error handling (also, no SG => no frags)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

364c6bad

[NET]: Add netif_tx_lock · 932ff279

Authored by Herbert Xu on Jun 09, 2006

Various drivers use xmit_lock internally to synchronise with their
transmission routines.  They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.

With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set.  This is because a printk that occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.

While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire.  So
delaying or dropping the message is to be avoided as much as possible.

So the only alternative is to always set xmit_lock_owner.  The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.

I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly.  I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.

This is pretty much a straight text substitution except for a small
bug fix in winbond.  It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission.  This is
unsafe as an IRQ can potentially wake up the queue.  So it is safer to
use netif_tx_disable.

The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
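
The idea of a lock wrapper that always records its owner can be sketched in user-space C with C11 atomics. This is an illustration under assumptions, not the netif_tx_lock implementation: struct tx_lock, NO_OWNER, and all helper names are hypothetical, and a plain integer stands in for the CPU id.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical lock that pairs a spinlock with its current owner. */
struct tx_lock {
    atomic_flag locked;  /* the spinlock itself */
    atomic_int  owner;   /* id of the holder, or NO_OWNER */
};

static void tx_lock_init(struct tx_lock *l)
{
    atomic_flag_clear(&l->locked);
    atomic_init(&l->owner, NO_OWNER);
}

/* Take the lock and always record who holds it. */
static void tx_lock_acquire(struct tx_lock *l, int cpu)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;  /* spin until the lock is free */
    atomic_store(&l->owner, cpu);
}

/* Clear the owner before dropping the lock. */
static void tx_lock_release(struct tx_lock *l)
{
    atomic_store(&l->owner, NO_OWNER);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/*
 * A netpoll-like caller can ask whether it already holds the lock and,
 * if so, avoid taking it again and deadlocking on itself.
 */
static bool tx_lock_held_by(struct tx_lock *l, int cpu)
{
    return atomic_load(&l->owner) == cpu;
}
```

The point of routing every acquisition through the wrapper is that the owner field is never stale: code that takes the raw lock directly would leave it at NO_OWNER and defeat the recursion check.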

932ff279

[NETFILTER]: hashlimit match: fix random initialization · bf0857ea

Authored by Patrick McHardy on Jun 09, 2006

hashlimit does:

        if (!ht->rnd)
                get_random_bytes(&ht->rnd, 4);

ignoring that 0 is also a valid random number.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
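
One way to avoid using the value 0 as the "not yet seeded" marker is to track initialization separately. The sketch below is a user-space illustration under assumptions: struct htable, htable_seed, and the stub generator are hypothetical, and the stub deliberately returns 0 to show that the seed is still drawn only once.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table state: the seed plus an explicit "seeded" flag. */
struct htable {
    uint32_t rnd;
    bool     rnd_initialized;
};

/* Counts how often the generator was asked for a value. */
static unsigned int stub_calls;

/* Stand-in for a random source; returns 0, which is a valid seed. */
static uint32_t stub_random(void)
{
    stub_calls++;
    return 0;
}

/* Seed the table exactly once, regardless of the value drawn. */
static void htable_seed(struct htable *ht, uint32_t (*gen)(void))
{
    if (!ht->rnd_initialized) {
        ht->rnd = gen();
        ht->rnd_initialized = true;
    }
}
```

With the original `if (!ht->rnd)` test, a drawn value of 0 would cause the seed to be regenerated on the next call, changing the hash of entries already in the table.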

bf0857ea

[NETFILTER]: recent match: missing refcnt initialization · 2b2283d0

Authored by Patrick McHardy on Jun 09, 2006

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b2283d0

[NETFILTER]: recent match: fix "sleeping function called from invalid context" · a0e889bb

Authored by Patrick McHardy on Jun 09, 2006

create_proc_entry must not be called with locks held. Use a mutex
instead to protect data only changed in user context.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

a0e889bb

[SECMARK]: Add CONNSECMARK xtables target · 100468e9

Authored by James Morris on Jun 09, 2006

Add a new xtables target, CONNSECMARK, which is used to specify rules
for copying security marks from packets to connections, and for
copying security marks back from connections to packets.  This is
similar to the CONNMARK target, but is more limited in scope in that
it only allows copying of security marks to and from packets, as this
is all it needs to do.

A typical scenario would be to apply a security mark to a 'new' packet
with SECMARK, then copy that to its conntrack via CONNMARK, and then
restore the security mark from the connection to established and
related packets on that connection.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

100468e9

[SECMARK]: Add secmark support to conntrack · 7c9728c3

Authored by James Morris on Jun 09, 2006

Add a secmark field to IP and NF conntracks, so that security markings
on packets can be copied to their associated connections, and also
copied back to packets as required.  This is similar to the network
mark field currently used with conntrack, although it is intended for
enforcement of security policy rather than network policy.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c9728c3

[SECMARK]: Add xtables SECMARK target · 5e6874cd

Authored by James Morris on Jun 09, 2006

Add a SECMARK target to xtables, allowing the admin to apply security
marks to packets via both iptables and ip6tables.

The target currently handles SELinux security marking, but can be
extended for other purposes as needed.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e6874cd

openanolis / cloud-kernel · synced successfully more than 1 year ago
