Commits · b2e4b3debc327a5b53d9622e0b1785eea2ea2aad · openeuler / raspberrypi-kernel

02 Sep, 2009 8 commits

tcp: MD5 operations should be const · b2e4b3de

Committed by Stephen Hemminger on Sep 01, 2009

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: seq_operations should be const · 98147d52

Committed by Stephen Hemminger on Sep 01, 2009

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: ip6_push_pending_frames() should increment IPSTATS_MIB_OUTDISCARDS · 06254914

Committed by Eric Dumazet on Sep 01, 2009

qdisc drops should be notified to IP_RECVERR enabled sockets, as done in IPV4.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drop_monitor: fix trace_napi_poll_hit() · f2798eb4

Committed by Xiao Guangrong on Aug 30, 2009

The net_dev of backlog napi is NULL, like below:

__get_cpu_var(softnet_data).backlog.dev == NULL

So, we should check it in napi tracepoint's probe function
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pkt_sched: Revert tasklet_hrtimer changes. · 2fbd3da3

Committed by David S. Miller on Sep 01, 2009

These are full of unresolved problems, mainly that conversions don't
work 1-1 from hrtimers to tasklet_hrtimers because unlike hrtimers
tasklets can't be killed from softirq context.

And when a qdisc gets reset, that's exactly what we need to do here.

We'll work this out in the net-next-2.6 tree and if warranted we'll
backport that work to -stable.

This reverts the following 3 changesets:

a2cb6a4d
("pkt_sched: Fix bogon in tasklet_hrtimer changes.")

38acce2d
("pkt_sched: Convert CBQ to tasklet_hrtimer.")

ee5f9757
("pkt_sched: Convert qdisc_watchdog to tasklet_hrtimer")
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sk_free() should be allowed right after sk_alloc() · d66ee058

Committed by Jarek Poplawski on Aug 30, 2009

After commit 2b85a34e
(net: No more expensive sock_hold()/sock_put() on each tx)
sk_free() frees socks conditionally and depends
on sk_wmem_alloc being set e.g. in sock_init_data(). But in some
cases sk_free() is called earlier, usually after other alloc errors.

Fix is to move sk_wmem_alloc initialization from sock_init_data()
to sk_alloc() itself.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
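
The failure mode described in this commit can be illustrated with a small toy model. This is only a sketch under assumptions: `toy_sock`, `toy_sk_alloc()` and `toy_sk_free()` are hypothetical names, and the decrement-and-test in `toy_sk_free()` is an assumed stand-in for the conditional free that the message refers to, not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy socket; only the counter relevant to the fix is modelled. */
struct toy_sock {
	int wmem_alloc;
};

/*
 * Allocation zeroes the object. With init_in_alloc set (the fixed behaviour),
 * the counter is also initialised to 1 right here, not in a later init step.
 */
static struct toy_sock *toy_sk_alloc(bool init_in_alloc)
{
	struct toy_sock *sk = calloc(1, sizeof(*sk));

	if (sk && init_in_alloc)
		sk->wmem_alloc = 1;
	return sk;
}

/*
 * Conditional free: the object is released only when the counter drops to
 * exactly zero. Returns true if the memory was actually freed.
 */
static bool toy_sk_free(struct toy_sock *sk)
{
	assert(sk != NULL);
	if (--sk->wmem_alloc == 0) {
		free(sk);
		return true;
	}
	return false;
}
```

In this model, calling `toy_sk_free()` right after `toy_sk_alloc(false)` leaves the counter at -1 and the object is never released, which corresponds to the early error paths mentioned in the message.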

net: make neigh_ops constant · 89d69d2b

Committed by Stephen Hemminger on Sep 01, 2009

These tables are never modified at runtime. Move to read-only
section.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns: embed ip6_dst_ops directly · 86393e52

Committed by Alexey Dobriyan on Aug 29, 2009

struct net::ipv6.ip6_dst_ops is separately dynamically allocated,
but there is no fundamental reason for it. Embed it directly into
struct netns_ipv6.

For that:
* move struct dst_ops into separate header to fix circular dependencies
	I honestly tried not to, it's pretty impossible to do other way
* drop dynamical allocation, allocate together with netns

For a change, remove struct dst_ops::dst_net, it's deducible
by using container_of() given dst_ops pointer.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
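
Why an embedded structure makes a back-pointer such as dst_net redundant can be sketched as follows. The structures and names here (`toy_dst_ops`, `toy_netns_ipv6`, `toy_container_of`, `toy_ops_to_netns`) are hypothetical and heavily reduced; the macro is a simplified stand-in for the kernel's container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of() macro. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, reduced structures for illustration only. */
struct toy_dst_ops {
	int family;
};

struct toy_netns_ipv6 {
	int placeholder;
	struct toy_dst_ops ip6_dst_ops;	/* embedded, not a pointer */
};

/* Recover the enclosing structure from a pointer to the embedded member. */
static struct toy_netns_ipv6 *toy_ops_to_netns(struct toy_dst_ops *ops)
{
	assert(ops != NULL);
	return toy_container_of(ops, struct toy_netns_ipv6, ip6_dst_ops);
}
```

Because the member lives at a fixed offset inside its container, the container's address follows from pointer arithmetic alone, so no extra field is needed to find it.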

01 Sep, 2009 12 commits

Revert Backoff [v3]: Calculate TCP's connection close threshold as a time value. · 6fa12c85

Committed by Damian Lukowski on Aug 26, 2009

RFC 1122 specifies two threshold values R1 and R2 for connection timeouts,
which may represent a number of allowed retransmissions or a timeout value.
Currently linux uses sysctl_tcp_retries{1,2} to specify the thresholds
in number of allowed retransmissions.

For any desired threshold R2 (by means of time) one can specify tcp_retries2
(by means of number of retransmissions) such that TCP will not time out
earlier than R2. This is the case, because the RTO schedule follows a fixed
pattern, namely exponential backoff.

However, the RTO behaviour is not predictable any more if RTO backoffs can be
reverted, as it is the case in the draft
"Make TCP more Robust to Long Connectivity Disruptions"
(http://tools.ietf.org/html/draft-zimmermann-tcp-lcd).

In the worst case TCP would time out a connection after 3.2 seconds, if the
initial RTO equaled MIN_RTO and each backoff has been reverted.

This patch introduces a function retransmits_timed_out(N),
which calculates the timeout of a TCP connection, assuming an initial
RTO of MIN_RTO and N unsuccessful, exponentially backed-off retransmissions.

Whenever timeout decisions are made by comparing the retransmission counter
to some value N, this function can be used, instead.

The meaning of tcp_retries2 will be changed, as many more RTO retransmissions
can occur than the value indicates. However, it yields a timeout which is
similar to the one of an unpatched, exponentially backing off TCP in the same
scenario. As no application could rely on an RTO greater than MIN_RTO, there
should be no risk of a regression.
Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
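
The arithmetic behind this commit can be sketched as below. This is an illustration, not the patch itself: the `toy_` names are hypothetical, and the 200 ms value for MIN_RTO is an assumption. With 15 expiries after the first transmission and every backoff reverted, the model yields the 3.2 seconds quoted in the message.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MIN_RTO_MS 200u	/* assumption: a MIN_RTO of 200 ms */

/*
 * Time until the n-th exponentially backed-off retransmission has itself
 * timed out, starting from an initial RTO of min_rto_ms:
 * min_rto * (1 + 2 + ... + 2^n) = min_rto * (2^(n+1) - 1).
 */
static uint64_t toy_backoff_timeout_ms(unsigned int n, uint64_t min_rto_ms)
{
	assert(n < 62);		/* keep the shift within 64 bits */
	return min_rto_ms * (((uint64_t)2 << n) - 1);
}

/* Same number of timer expiries, but with every backoff reverted. */
static uint64_t toy_reverted_timeout_ms(unsigned int n, uint64_t min_rto_ms)
{
	return min_rto_ms * ((uint64_t)n + 1);
}

/* Time-based decision in place of a plain "retransmits >= n" comparison. */
static int toy_retransmits_timed_out(uint64_t elapsed_ms, unsigned int n)
{
	return elapsed_ms >= toy_backoff_timeout_ms(n, TOY_MIN_RTO_MS);
}
```

Comparing elapsed time against the backed-off schedule keeps the effective threshold close to that of an unpatched stack even when individual backoffs are reverted.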

Revert Backoff [v3]: Revert RTO on ICMP destination unreachable · f1ecd5d9

Committed by Damian Lukowski on Aug 26, 2009

Here, an ICMP host/network unreachable message, whose payload fits to
TCP's SND.UNA, is taken as an indication that the RTO retransmission has
not been lost due to congestion, but because of a route failure
somewhere along the path.
With true congestion, a router won't trigger such a message and the
patched TCP will operate as standard TCP.

This patch reverts one RTO backoff, if an ICMP host/network unreachable
message, whose payload fits to TCP's SND.UNA, arrives.
Based on the new RTO, the retransmission timer is reset to reflect the
remaining time, or - if the revert clocked out the timer - a retransmission
is sent out immediately.
Backoffs are only reverted, if TCP is in RTO loss recovery, i.e. if
there have been retransmissions and reversible backoffs, already.

Changes from v2:
1) Renaming of skb in tcp_v4_err() moved to another patch.
2) Reintroduced tcp_bound_rto() and __tcp_set_rto().
3) Fixed code comments.
Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
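
The revert step described in this commit might look roughly like the following toy model. All names (`toy_tcp`, `toy_revert_backoff`) are hypothetical. The model ignores any upper bound on the RTO and treats the backed-off RTO as the base value shifted left by the backoff count, which is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced connection state. */
struct toy_tcp {
	unsigned int base_rto_ms;	/* RTO before any backoff */
	unsigned int backoff;		/* backoffs currently applied */
	unsigned int retransmits;	/* RTO retransmissions so far */
};

/*
 * Revert one backoff after a matching ICMP unreachable message.
 * Returns -1 if nothing can be reverted (not in RTO loss recovery),
 * 0 if the shortened timer has already expired (retransmit now),
 * otherwise the number of milliseconds the timer should still run.
 */
static long toy_revert_backoff(struct toy_tcp *tp, unsigned int elapsed_ms)
{
	unsigned int rto;

	assert(tp != NULL);
	if (tp->retransmits == 0 || tp->backoff == 0)
		return -1;

	tp->backoff--;
	rto = tp->base_rto_ms << tp->backoff;
	if (elapsed_ms >= rto)
		return 0;
	return (long)(rto - elapsed_ms);
}
```

For example, with a 200 ms base RTO and three backoffs applied, a revert after 500 ms leaves two backoffs (800 ms) and 300 ms on the timer, while a revert after 900 ms calls for an immediate retransmission.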

Revert Backoff [v3]: Rename skb to icmp_skb in tcp_v4_err() · 4d1a2d9e

Committed by Damian Lukowski on Aug 26, 2009

This supplementary patch renames skb to icmp_skb in tcp_v4_err() in order to
disambiguate from another sk_buff variable, which will be introduced
in a separate patch.
Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

dcbnl: Add implementations of dcbnl setapp/getapp commands · 57949686

Committed by Yi Zou on Aug 31, 2009

Implements the dcbnl netlink setapp/getapp pair. When a setapp/getapp
is received, dcbnl would just pass on to dcbnl_rtnl_op.setapp/getapp
that are supposed to be implemented by the low level drivers.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dcbnl: Add netlink attributes for setapp/getapp to dcbnl · 6fa382af

Committed by Yi Zou on Aug 31, 2009

Add defines for dcbnl netlink attributes to support netlink message passing of
setapp/getapp in dcbnl.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: Add support for net_devices_ops.ndo_fcoe_enable/_disable to VLAN · 0af46d99

Committed by Yi Zou on Aug 31, 2009

This adds implementation of the net_devices_ops.ndo_fcoe_enable/_disable to
the VLAN driver. It checks if the real_dev has support for ndo_fcoe_enable/
ndo_fcoe_disable and if so, passes on to call the associated real_dev.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wireless: convert drivers to netdev_tx_t · d0cf9c0d

Committed by Stephen Hemminger on Aug 31, 2009

Mostly just simple conversions:
  * ray_cs had bogus return of NET_TX_LOCKED but driver
    was not using NETIF_F_LLTX
  * hostap and ipw2x00 had some code that returned value
    from a called function that also had to change to return netdev_tx_t
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev: convert pseudo drivers to netdev_tx_t · 424efe9c

Committed by Stephen Hemminger on Aug 31, 2009

These are all drivers that don't touch real hardware.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

irda: convert to netdev_tx_t · 6518bbb8

Committed by Stephen Hemminger on Aug 31, 2009

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

convert hamradio drivers to netdev_txreturnt_t · 36e4d64a

Committed by Stephen Hemminger on Aug 31, 2009

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

convert ATM drivers to netdev_tx_t · 3c805a22

Committed by Stephen Hemminger on Aug 31, 2009

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev: convert pseudo-devices to netdev_tx_t · 6fef4c0c

Committed by Stephen Hemminger on Aug 31, 2009

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31 Aug, 2009 3 commits

pkt_sched: Fix resource limiting in pfifo_fast · a453e068

Committed by Krishna Kumar on Aug 30, 2009

pfifo_fast_enqueue has this check:
        if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {

which allows each band to enqueue up to tx_queue_len skbs for a
total of 3*tx_queue_len skbs. I am not sure if this was the
intention of limiting in qdisc.

Patch compiled and 32 simultaneous netperf testing ran fine. Also:
# tc -s qdisc show dev eth2
qdisc pfifo_fast 0: root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25) 
 rate 0bit 0pps backlog 0b 0p requeues 25 
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
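
The difference between limiting per band and limiting the qdisc as a whole can be shown with a toy model. The structure and function names below are hypothetical, and the "total" variant is an assumed reading of the intended behaviour rather than a copy of the patch.

```c
#include <assert.h>

#define TOY_BANDS 3

/* Hypothetical, reduced queue state: only lengths are tracked. */
struct toy_qdisc {
	unsigned int band_len[TOY_BANDS];
	unsigned int total_len;
	unsigned int tx_queue_len;
};

/* Old-style check: each band is compared against tx_queue_len on its own. */
static int toy_enqueue_per_band(struct toy_qdisc *q, unsigned int band)
{
	assert(band < TOY_BANDS);
	if (q->band_len[band] < q->tx_queue_len) {
		q->band_len[band]++;
		q->total_len++;
		return 1;	/* accepted */
	}
	return 0;		/* dropped */
}

/* Assumed fix: the length of the whole qdisc is compared instead. */
static int toy_enqueue_total(struct toy_qdisc *q, unsigned int band)
{
	assert(band < TOY_BANDS);
	if (q->total_len < q->tx_queue_len) {
		q->band_len[band]++;
		q->total_len++;
		return 1;
	}
	return 0;
}
```

With tx_queue_len set to 2, the per-band check admits up to six packets across the three bands, whereas the total check stops at two.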

net: convert remaining non-symbolic return values in dev_queue_xmit · 03a9a447

Committed by Krishna Kumar on Aug 29, 2009

Patch compiled and 32 simultaneous netperf testing ran fine.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: use correct NET_RX_ return values · 6ca8b990

Committed by Oliver Hartkopp on Aug 29, 2009

Dropped skb's should be documented by an appropriate return value.
Use the correct NET_RX_DROP and NET_RX_SUCCESS values for that reason.
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Aug, 2009 17 commits

tipc: fix test of bearer_priority range in tipc_register_media() · b3df9a51

Committed by roel kluin on Aug 27, 2009

For the bearer_priority to be less than TIPC_MIN_LINK_PRI and greater than
TIPC_MAX_LINK_PRI is logically impossible.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
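
The logic error named in this commit is easy to reproduce in isolation. In the sketch below the bounds 1 and 31 and the `toy_` names are hypothetical placeholders, not the values or functions used by TIPC.

```c
#include <assert.h>

#define TOY_MIN_LINK_PRI 1u	/* hypothetical lower bound */
#define TOY_MAX_LINK_PRI 31u	/* hypothetical upper bound */

/* Broken test: no value is both below the minimum and above the maximum. */
static int toy_out_of_range_broken(unsigned int pri)
{
	return pri < TOY_MIN_LINK_PRI && pri > TOY_MAX_LINK_PRI;
}

/* Corrected test: a value is out of range if it violates either bound. */
static int toy_out_of_range_fixed(unsigned int pri)
{
	assert(TOY_MIN_LINK_PRI <= TOY_MAX_LINK_PRI);	/* sanity of the bounds */
	return pri < TOY_MIN_LINK_PRI || pri > TOY_MAX_LINK_PRI;
}
```

The broken variant returns 0 for every input, so an invalid priority would never be rejected.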

can: switch to seq_file · ea00b8e2

Committed by Alexey Dobriyan on Aug 28, 2009

create_proc_read_entry() is going to be removed soon.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Remove redundant copy of MD5 authentication key · 9a7030b7

Committed by John Dykstra on Aug 19, 2009

Remove the copy of the MD5 authentication key from tcp_check_req().
This key has already been copied by tcp_v4_syn_recv_sock() or
tcp_v6_syn_recv_sock().
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Speed-up pfifo_fast lookup using a private bitmap · fd3ae5e8

Committed by Krishna Kumar on Aug 18, 2009

Maintain a per-qdisc bitmap for pfifo_fast giving availability
of skbs for each band. This allows faster lookup for a skb when
there are no high priority skbs. Also, it helps in (rare) cases
when there are no skbs on the list, where an immediate lookup is
faster than iterating through the three bands.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
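
One way to realise the bitmap lookup described in this commit is a small table indexed by the bitmap value. The sketch below is a toy model: the `toy_` names are hypothetical, only queue lengths are tracked, and band 0 is assumed to be the highest priority.

```c
#include <assert.h>

#define TOY_BANDS 3

/*
 * Index: bitmap of non-empty bands (bit 0 = band 0, assumed highest priority).
 * Value: the first non-empty band, or -1 when every band is empty.
 */
static const int toy_bitmap2band[1 << TOY_BANDS] = { -1, 0, 1, 0, 2, 0, 1, 0 };

/* Hypothetical private state: one bit and one length per band. */
struct toy_priv {
	unsigned int bitmap;
	unsigned int len[TOY_BANDS];
};

static void toy_enqueue(struct toy_priv *p, unsigned int band)
{
	assert(band < TOY_BANDS);
	p->len[band]++;
	p->bitmap |= 1u << band;
}

/* Returns the band a packet was taken from, or -1 if nothing was queued. */
static int toy_dequeue(struct toy_priv *p)
{
	int band = toy_bitmap2band[p->bitmap];

	if (band >= 0 && --p->len[band] == 0)
		p->bitmap &= ~(1u << band);
	return band;
}
```

The empty case and the "only low priority traffic" case both resolve with a single table read instead of a walk over the three bands.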

ipv6: Update Neighbor Cache when IPv6 RA is received on a router · 31ce8c71

Committed by David Ward on Aug 29, 2009

When processing a received IPv6 Router Advertisement, the kernel
creates or updates an IPv6 Neighbor Cache entry for the sender --
but presently this does not occur if IPv6 forwarding is enabled
(net.ipv6.conf.*.forwarding = 1), or if IPv6 Router Advertisements
are not accepted (net.ipv6.conf.*.accept_ra = 0), because in these
cases processing of the Router Advertisement has already halted.

This patch allows the Neighbor Cache to be updated in these cases,
while still avoiding any modification to routes or link parameters.

This continues to satisfy RFC 4861, since any entry created in the
Neighbor Cache as the result of a received Router Advertisement is
still placed in the STALE state.
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix premature termination of FIN_WAIT2 time-wait sockets · 80a1096b

Committed by Octavian Purdila on Aug 29, 2009

There is a race condition in the time-wait sockets code that can lead
to premature termination of FIN_WAIT2 and, subsequently, to RST
generation when the FIN,ACK from the peer finally arrives:

Time     TCP header
0.000000 30755 > http [SYN] Seq=0 Win=2920 Len=0 MSS=1460 TSV=282912 TSER=0
0.000008 http > 30755 [SYN, ACK] Seq=0 Ack=1 Win=2896 Len=0 MSS=1460 TSV=...
0.136899 HEAD /1b.html?n1Lg=v1 HTTP/1.0 [Packet size limited during capture]
0.136934 HTTP/1.0 200 OK [Packet size limited during capture]
0.136945 http > 30755 [FIN, ACK] Seq=187 Ack=207 Win=2690 Len=0 TSV=270521...
0.136974 30755 > http [ACK] Seq=207 Ack=187 Win=2734 Len=0 TSV=283049 TSER=...
0.177983 30755 > http [ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283089 TSER=...
0.238618 30755 > http [FIN, ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283151...
0.238625 http > 30755 [RST] Seq=188 Win=0 Len=0

Say twdr->slot = 1 and we are running inet_twdr_hangman and in this
instance inet_twdr_do_twkill_work returns 1. At that point we will
mark slot 1 and schedule inet_twdr_twkill_work. We will also make
twdr->slot = 2.

Next, a connection is closed and tcp_time_wait(TCP_FIN_WAIT2, timeo)
is called which will create a new FIN_WAIT2 time-wait socket and will
place it in the last to be reached slot, i.e. twdr->slot = 1.

At this point say inet_twdr_twkill_work will run which will start
destroying the time-wait sockets in slot 1, including the just added
TCP_FIN_WAIT2 one.

To avoid this issue we increment the slot only if all entries in the
slot have been purged.

This change may delay the slots cleanup by a time-wait death row
period but only if the worker thread didn't have the time to run/purge
the current slot in the next period (6 seconds with default sysctl
settings). However, on such a busy system even without this change we
would probably see delays...
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: resize rework · 80b71b80

Committed by Jens Låås on Aug 28, 2009

Here is rework and cleanup of the resize function.

Some bugs we had. We were using ->parent when we should use 
node_parent(). Also we used ->parent which is not assigned by
inflate in inflate loop.

Also a fix to set thresholds to power 2 to fit halve 
and double strategy.

max_resize is renamed to max_work which better indicates
its function.

Reaching max_work is not an error, so warning is removed. 
max_work only limits amount of work done per resize.
(limits CPU-usage, outstanding memory etc).

The clean-up makes it relatively easy to add fixed sized 
root-nodes if we would like to decrease the memory pressure
on routers with large routing tables and dynamic routing.
If we'll need that...

It's been tested with 280k routes.

Work done together with Robert Olsson.
Signed-off-by: Jens Låås <jens.laas@its.uu.se>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>

sit: allow ip fragmentation when using nopmtudisc to fix package loss · 8945a808

Committed by Sascha Hlusiak on Aug 28, 2009

if tunnel parameters have frag_off set to IP_DF, pmtudisc on the ipv4 link
will be performed by deriving the mtu from the ipv4 link and setting the
DF-Flag of the encapsulating IPv4 Header. If fragmentation is needed on the
way, the IPv4 pmtu gets adjusted, the ipv6 package will be resent eventually,
using the new and lower mtu and everyone is happy.

If the frag_off parameter is unset, the mtu for the tunnel will be derived
from the tunnel device or the ipv6 pmtu, which might be higher than the ipv4
pmtu. In that case we must allow the fragmentation of the IPv4 packet because
the IPv6 mtu wouldn't 'learn' from the adjusted IPv4 pmtu, resulting in
frequent icmp_frag_needed and package loss on the IPv6 layer.

This patch allows fragmentation when tunnel was created with parameter
nopmtudisc, like in ipip/gre tunnels.
Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ip_rt_send_redirect() optimization · 30038fc6

Committed by Eric Dumazet on Aug 28, 2009

While doing some forwarding benchmarks, I noticed
ip_rt_send_redirect() is rather expensive, even if send_redirects is
false for the device.

Fix is to avoid two atomic ops, we don't really need to take a
reference on in_dev
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: keepalive cleanups · df19a626

Committed by Eric Dumazet on Aug 28, 2009

Introduce keepalive_probes(tp) helper, and use it, like
keepalive_time_when(tp) and keepalive_intvl_when(tp)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
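
A helper of this kind typically returns the per-socket setting when one exists and a system-wide default otherwise. The sketch below assumes that behaviour; the `toy_` names are hypothetical, and the default of 9 probes is an assumption standing in for the sysctl value.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: system-wide default, standing in for the sysctl value. */
static unsigned int toy_sysctl_tcp_keepalive_probes = 9;

/* Hypothetical, reduced socket state. */
struct toy_tcp_sock {
	unsigned int keepalive_probes;	/* 0 means "not set on this socket" */
};

/* Per-socket value if set, otherwise the system-wide default. */
static unsigned int toy_keepalive_probes(const struct toy_tcp_sock *tp)
{
	assert(tp != NULL);
	return tp->keepalive_probes ? tp->keepalive_probes
				    : toy_sysctl_tcp_keepalive_probes;
}
```

Wrapping the fallback in one helper keeps every caller consistent, which matches the stated goal of treating the probe count like the existing time and interval helpers.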

ipv4: af_inet.c cleanups · 3d1427f8

Committed by Eric Dumazet on Aug 28, 2009

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pktgen: use proc_create_data() · 2975315b

Committed by Alexey Dobriyan on Aug 28, 2009

It looks like after rename device proc entry is unusable,
because of no ->read_proc or ->proc_fops.

And create_proc_entry() is deprecated.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pktgen: increase version · c3d2f52d

Committed by Stephen Hemminger on Aug 27, 2009

Increase module version, and cleanup module info.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pktgen: cleanup checkpatch warnings · 63adc6fb

Committed by Stephen Hemminger on Aug 27, 2009

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pktgen: use common idle routine · 64e8ff5e

Committed by Stephen Hemminger on Aug 27, 2009

Simpler to have one place that spins and accounts for delays,
this will also make the last packet be detected faster for more
repeatable timing.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pktgen: spin using hrtimer · 2bc481cf

Committed by Stephen Hemminger on Aug 28, 2009

This changes how the pktgen thread spins/waits between
packets if delay is configured. It uses a high res timer to
wait for time to arrive.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pktgen: convert to use ktime_t · fd29cf72

Committed by Stephen Hemminger on Aug 27, 2009

The kernel ktime_t is a nice generic infrastructure for managing
high resolution times, as is done in pktgen.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>