Commits · e8648a1fdb54da1f683784b36a17aa65ea56e931 · openeuler / Kernel

23 Jul, 2010 4 commits

Committed by Eric Dumazet on Jul 23, 2010

In some situations a CPU match permits a better spreading of
connections, or selecting targets only for a given cpu.

With Receive Packet Steering or a multiqueue NIC and appropriate IRQ
affinities, we can distribute traffic on available cpus, per session.
(all RX packets for a given flow are handled by a given cpu)

Some legacy applications being not SMP friendly, one way to scale a
server is to run multiple copies of them.

Instead of randomly choosing an instance, we can use the cpu number as a
key so that softirq handler for a whole instance is running on a single
cpu, maximizing cache effects in TCP/UDP stacks.

Using NAT for example, a four-way machine might run four copies of the
server application, using a separate listening port for each instance,
but still presenting a unique external port:

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
        -j REDIRECT --to-port 8080

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
        -j REDIRECT --to-port 8081

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
        -j REDIRECT --to-port 8082

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
        -j REDIRECT --to-port 8083
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

e8648a1f

IPVS: make FTP work with full NAT support · 7f1c4075

Committed by Hannes Eder on Jul 23, 2010

Use nf_conntrack/nf_nat code to do the packet mangling and the TCP
sequence adjusting.  The function 'ip_vs_skb_replace' is now dead
code, so it is removed.

To SNAT FTP, use something like:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
    --vport 21 -j SNAT --to-source 192.168.10.10
and for the data connections in passive mode:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
    --vportctl 21 -j SNAT --to-source 192.168.10.10
Using '-m state --state RELATED' would also work.

Make sure the kernel modules ip_vs_ftp, nf_conntrack_ftp, and
nf_nat_ftp are loaded.

[ up-port and minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>

7f1c4075

IPVS: make friends with nf_conntrack · 7b215ffc

Committed by Hannes Eder on Jul 23, 2010

Update the nf_conntrack tuple in reply direction, as we will see
traffic from the real server (RIP) to the client (CIP).  Once this is
done we can use netfilter's SNAT in POSTROUTING, especially with
xt_ipvs, to do source NAT, e.g.:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 --vport 80 \
		  -j SNAT --to-source 192.168.10.10

[ minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>

7b215ffc

netfilter: xt_ipvs (netfilter matcher for IPVS) · 9c3e1c39

Committed by Hannes Eder on Jul 23, 2010

This implements the kernel-space side of the netfilter matcher xt_ipvs.

[ minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
[ Patrick: added xt_ipvs.h to Kbuild ]
Signed-off-by: Patrick McHardy <kaber@trash.net>

9c3e1c39

15 Jul, 2010 2 commits

netfilter: add CHECKSUM target · edf0e1fb

Committed by Michael S. Tsirkin on Jul 15, 2010

This adds a `CHECKSUM' target, which can be used in the iptables mangle
table.

You can use this target to compute and fill in the checksum in
a packet that lacks a checksum.  This is particularly useful,
if you need to work around old applications such as dhcp clients,
that do not work well with checksum offloads, but don't want to
disable checksum offload in your device.

The problem happens in the field with virtualized applications.
For reference, see Red Hat bz 605555, as well as
http://www.spinics.net/lists/kvm/msg37660.html

Typical expected use (helps old dhclient binary running in a VM):
iptables -A POSTROUTING -t mangle -p udp --dport bootpc \
	-j CHECKSUM --checksum-fill

Includes fixes by Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

edf0e1fb
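
As a rough illustration of what "compute and fill in the checksum" means for commit edf0e1fb, the sketch below implements the standard Internet checksum (RFC 1071) over a byte buffer. It is not the kernel's implementation; the function name `inet_checksum` is made up for this example, and the 32-bit accumulator assumes buffers well under 128 KiB.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over a byte buffer (illustrative, not kernel code). */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* Add the buffer as big-endian 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }

    /* An odd trailing byte is treated as the high byte of a final word. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    /* The checksum is the ones' complement of the folded sum. */
    return (uint16_t)~sum;
}
```

A sender stores the returned value in the checksum field (which must be zero while summing); a packet that skipped this step because offload was expected is what the CHECKSUM target repairs.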

netfilter: nf_ct_tcp: fix flow recovery with TCP window tracking enabled · fac42a9a

Committed by Pablo Neira Ayuso on Jul 15, 2010

This patch adds the missing bits to support the recovery of TCP flows
without disabling window tracking (aka be_liberal). To ensure a
successful recovery, we have to inject the window scale factor via
ctnetlink.

This patch has been tested with a development snapshot of conntrackd
and the new clause `TCPWindowTracking' that allows to perform strict
TCP window tracking recovery across fail-overs.

With this patch, we don't update the receiver's window until it's not
initiated. We require this to perform a successful recovery. Jozsef
confirmed in a private email that this spotted a real issue since that
should not happen.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

fac42a9a

09 Jul, 2010 2 commits

netfilter: xt_TPROXY: the length of lines should be within 80 · 116e1f1b

Committed by Changli Gao on Jul 09, 2010

According to the Documentation/CodingStyle, the length of lines should
be within 80.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

116e1f1b

ipvs: lvs sctp protocol handler is incorrectly invoked ip_vs_app_pkt_out · 8a0acaac

Committed by Xiaoyu Du on Jul 09, 2010

The lvs sctp protocol handler incorrectly invokes ip_vs_app_pkt_out.
Since there are no sctp helpers at present, it does the same thing as
ip_vs_app_pkt_in.
Signed-off-by: Xiaoyu Du <tingsrain@gmail.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>

8a0acaac

05 Jul, 2010 5 commits

ipvs: Kconfig cleanup · 72c7664f

Committed by Michal Marek on Jul 05, 2010

IP_VS_PROTO_AH_ESP should be set iff either of IP_VS_PROTO_{AH,ESP} is
selected. Express this with standard kconfig syntax.
Signed-off-by: Michal Marek <mmarek@suse.cz>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>

72c7664f

netfilter: ipt_REJECT: avoid touching dst ref · b13b7125

Committed by Eric Dumazet on Jul 05, 2010

We can avoid a pair of atomic ops in ipt_REJECT send_reset()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

b13b7125

netfilter: ipt_REJECT: postpone the checksum calculation. · 98b0e84a

Committed by Changli Gao on Jul 05, 2010

Postpone the checksum calculation: if the output NIC supports checksum
offloading, we can utilize it. Even if the output NIC doesn't support
checksum offloading, when we mangle this packet later this frees us from
updating the checksum, as the checksum calculation occurs afterwards.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

98b0e84a

netfilter: nf_conntrack_reasm: add fast path for in-order fragments · ea8fbe8f

Committed by Changli Gao on Jul 05, 2010

As the fragments are sent in order in most of OSes, such as Windows, Darwin and
FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue.
In the fast path, we check if the skb at the end of the inet_frag_queue is the
prev we expect.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

ea8fbe8f
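
The fast path described in commit ea8fbe8f can be sketched roughly as follows. This is a simplified model, not the kernel's code: `struct frag`, `struct frag_queue`, `frag_queue_insert` and the `slow_scans` counter are hypothetical names for illustration, and overlap handling is left out entirely.

```c
#include <stddef.h>

/* Hypothetical, simplified fragment: only the byte offset and length matter here. */
struct frag {
    unsigned int offset;
    unsigned int len;
    struct frag *next;
};

/* Hypothetical reassembly queue, kept sorted by offset, with a cached tail. */
struct frag_queue {
    struct frag *head;
    struct frag *tail;
    unsigned int slow_scans;   /* how often the full list walk was needed */
};

static void frag_queue_insert(struct frag_queue *q, struct frag *f)
{
    struct frag *prev = NULL, *next = NULL;

    if (!q->tail || q->tail->offset < f->offset) {
        /* Fast path: the fragment belongs after the current tail. */
        prev = q->tail;
    } else {
        /* Slow path: walk the list to find the insertion point. */
        q->slow_scans++;
        for (next = q->head; next; next = next->next) {
            if (next->offset >= f->offset)
                break;
            prev = next;
        }
    }

    /* Link the fragment in between prev and next. */
    f->next = next;
    if (!next)
        q->tail = f;
    if (prev)
        prev->next = f;
    else
        q->head = f;
}
```

With senders that emit fragments in order, every insert takes the first branch and the list is never walked; only a reordered fragment pays for the scan.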

netdevice.h net/core/dev.c: Convert netdev_<level> logging macros to functions · 256df2f3

Committed by Joe Perches on Jun 27, 2010

Reduces an x86 defconfig text and data ~2k.
text is smaller, data is larger.

$ size vmlinux*
   text	   data	    bss	    dec	    hex	filename
7198862	 720112	1366288	9285262	 8dae8e	vmlinux
7205273	 716016	1366288	9287577	 8db799	vmlinux.device_h

Uses %pV and struct va_format
Format arguments are verified before printk
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

256df2f3

02 Jul, 2010 1 commit

bridge: add per bridge device controls for invoking iptables · 4df53d8b

Committed by Patrick McHardy on Jul 02, 2010

Support more fine grained control of bridge netfilter iptables invocation
by adding separate brnf_call_*tables parameters for each device using the
sysfs interface. Packets are passed to layer 3 netfilter when either the
global parameter or the per bridge parameter is enabled.
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>

4df53d8b

01 Jul, 2010 10 commits

ethtool: Add support for control of RX flow hash indirection · a5b6ee29

Committed by Ben Hutchings on Jun 30, 2010

Many NICs use an indirection table to map an RX flow hash value to one
of an arbitrary number of queues (not necessarily a power of 2). It
can be useful to remove some queues from this indirection table so
that they are only used for flows that are specifically filtered
there. It may also be useful to weight the mapping to account for
user processes with the same CPU-affinity as the RX interrupts.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a5b6ee29
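
To make the indirection-table idea in commit a5b6ee29 concrete, here is a minimal model of how a flow hash might be mapped to a queue, and how a queue can be kept out of the table. The table size `INDIR_SIZE` and the helpers `indir_fill` and `indir_lookup` are assumptions for illustration; real NICs and the ethtool interface differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

#define INDIR_SIZE 8   /* hypothetical table size; real hardware varies */

/* Fill the table round-robin over the first n_queues queues only (n_queues > 0). */
static void indir_fill(uint32_t table[INDIR_SIZE], uint32_t n_queues)
{
    for (size_t i = 0; i < INDIR_SIZE; i++)
        table[i] = (uint32_t)(i % n_queues);
}

/* Map an RX flow hash to a queue through the indirection table. */
static uint32_t indir_lookup(const uint32_t table[INDIR_SIZE], uint32_t hash)
{
    return table[hash % INDIR_SIZE];
}
```

On a four-queue device, calling `indir_fill(table, 3)` leaves queue 3 out of the table, so hashed flows never land there and it stays free for explicitly filtered flows. Weighting works the same way: a queue that appears in more slots receives a larger share of flows.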

ethtool: Change ethtool_op_set_flags to validate flags · 1437ce39

Committed by Ben Hutchings on Jun 30, 2010

ethtool_op_set_flags() does not check for unsupported flags, and has
no way of doing so.  This means it is not suitable for use as a
default implementation of ethtool_ops::set_flags.

Add a 'supported' parameter specifying the flags that the driver and
hardware support, validate the requested flags against this, and
change all current callers to pass this parameter.

Change some other trivial implementations of ethtool_ops::set_flags to
call ethtool_op_set_flags().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1437ce39
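
The validation that commit 1437ce39 adds can be pictured with the following sketch. It only illustrates the "reject anything outside the supported mask" check; `struct fake_dev` and `set_flags_checked` are hypothetical names, and the real function also translates ethtool flags into device feature bits, which is omitted here.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical device state; the real struct net_device is far larger. */
struct fake_dev {
    uint32_t features;
};

/* Reject any requested flag outside the supported mask, otherwise apply it. */
static int set_flags_checked(struct fake_dev *dev, uint32_t requested,
                             uint32_t supported)
{
    if (requested & ~supported)
        return -EINVAL;

    dev->features = requested;
    return 0;
}
```

Passing the mask as a parameter lets each driver state what its hardware can do while sharing one default implementation.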

fragment: add fast path for in-order fragments · d6bebca9

Committed by Changli Gao on Jun 29, 2010

add fast path for in-order fragments

As the fragments are sent in order in most of OSes, such as Windows, Darwin and
FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue.
In the fast path, we check if the skb at the end of the inet_frag_queue is the
prev we expect.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_frag.h |    1 +
 net/ipv4/ip_fragment.c  |   12 ++++++++++++
 net/ipv6/reassembly.c   |   11 +++++++++++
 3 files changed, 24 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>

d6bebca9

snmp: 64bit ipstats_mib for all arches · 4ce3c183

Committed by Eric Dumazet on Jun 30, 2010

/proc/net/snmp and /proc/net/netstat expose SNMP counters.

Width of these counters is either 32 or 64 bits, depending on the size
of "unsigned long" in kernel.

This means user program parsing these files must already be prepared to
deal with 64bit values, regardless of user program being 32 or 64 bit.

This patch introduces 64bit snmp values for IPSTAT mib, where some
counters can wrap pretty fast if they are 32bit wide.

# netstat -s|egrep "InOctets|OutOctets"
    InOctets: 244068329096
    OutOctets: 244069348848
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4ce3c183
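
To see why the octet counters in commit 4ce3c183 need 64 bits, the small sketch below shows what a 32-bit counter would report for the InOctets value quoted in the commit message. The helper names are invented for this example.

```c
#include <stdint.h>

/* What a 32-bit wide counter would report for a 64-bit octet count. */
static uint32_t as_32bit_counter(uint64_t octets)
{
    return (uint32_t)octets;   /* keeps only the low 32 bits */
}

/* How many times a 32-bit counter wrapped on the way to this total. */
static uint32_t wrap_count(uint64_t octets)
{
    return (uint32_t)(octets >> 32);
}
```

For 244068329096 octets, a 32-bit counter would already have wrapped 56 times and would read 3550160520, so a monitoring tool that samples too rarely has no way to recover the true total.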

act_nat: use stack variable · 504f85c9

Committed by Changli Gao on Jun 29, 2010

act_nat: use stack variable

structure tc_nat isn't too big for stack, so we can put it in stack.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/sched/act_nat.c |   31 ++++++++++---------------------
 1 file changed, 10 insertions(+), 21 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>

504f85c9

act_mirred: combine duplicate code · 5acbf7f1

Committed by Changli Gao on Jun 29, 2010

act_mirred: combine duplicate code

tcf_bstats is updated anyway, so we can do it earlier to reduce the size of
the code.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
----
 net/sched/act_mirred.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>

5acbf7f1

mac82011: Allow selection of minstrel_ht as default rc algorithm · 92b50c4b

Committed by Helmut Schaa on Jun 30, 2010

Allow selection of minstrel_ht as default rate control algorithm. At
the moment minstrel_ht can only be requested by the driver code but
not selected as default in make menuconfig. Fix this by using
minstrel_ht when minstrel was selected as default and minstrel_ht
is available.

This change won't affect legacy devices as minstrel_ht falls back to
minstrel in that case.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

92b50c4b

net/core: use ntohs for skb->protocol · 70777d03

Committed by Sebastian Andrzej Siewior on Jun 30, 2010

This is only noticed by people that are not doing everything correct in
the first place.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

70777d03

ipv6: Use interface max_desync_factor instead of static default · 784e2710

Committed by Ben Hutchings on Jun 26, 2010

max_desync_factor can be configured per-interface, but nothing is
using the value.
Reported-by: Piotr Lewandowski <piotr.lewandowski@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

784e2710

ipv6: Clamp reported valid_lft to a minimum of 0 · f56619fc

Committed by Ben Hutchings on Jun 26, 2010

Since addresses are only revalidated every 2 minutes, the reported
valid_lft can underflow shortly before the address is deleted.
Clamp it to a minimum of 0, as for prefered_lft.
Reported-by: Piotr Lewandowski <piotr.lewandowski@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

f56619fc
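
The underflow that commit f56619fc guards against is easy to reproduce with unsigned arithmetic. The helper below is a minimal sketch of the clamping idea, not the kernel's code; `remaining_lft` is a made-up name, and special values such as an infinite lifetime are not modelled.

```c
#include <stdint.h>

/* Remaining lifetime in seconds, clamped so it never goes below 0. */
static uint32_t remaining_lft(uint32_t lft, uint32_t elapsed)
{
    return lft > elapsed ? lft - elapsed : 0;
}
```

Without the clamp, an address with a 100 second lifetime inspected 150 seconds after its timestamp would report 100 - 150 as an unsigned value, that is 4294967246 seconds, instead of 0.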

30 Jun, 2010 3 commits

net/Makefile: conditionally descend to wireless and ieee802154 · d1e31689

Committed by Nicolas Kaiser on Jun 27, 2010

Don't descend to wireless and ieee802154 unless they are actually used.
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

d1e31689

mac80211: add basic tracing to drv_get_survey · c466d4ef

Committed by John W. Linville on Jun 29, 2010

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

c466d4ef

mac80211: remove unnecessary check in ieee80211_dump_survey · ff3074a4

Committed by John W. Linville on Jun 29, 2010

This check is duplicated in drv_get_survey.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

ff3074a4

29 Jun, 2010 8 commits

caif: Kconfig and Makefile fixes · 01eebb53

Committed by Sjur Braendeland on Jun 26, 2010

Use "depends on" instead of "if" in Kconfig files.
Fixed CAIF debug flag, and removed unnecessary clean-* options.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01eebb53

act_mirred: don't clone skb when skb isn't shared · 210d6de7

Committed by Changli Gao on Jun 24, 2010

don't clone skb when skb isn't shared

When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need
to clone a new skb. As the skb will be freed after this function returns, we
can use it freely once we get a reference to it.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/sch_generic.h |   11 +++++++++--
 net/sched/act_mirred.c    |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

210d6de7

tcp: tso_fragment() might avoid GFP_ATOMIC · c4ead4c5

Committed by Eric Dumazet on Jun 24, 2010

We can pass a gfp argument to tso_fragment() and avoid GFP_ATOMIC
allocations sometimes.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c4ead4c5

vlan: 64 bit rx counters · 9618e2ff

Committed by Eric Dumazet on Jun 24, 2010

Use u64_stats_sync infrastructure to implement 64bit rx stats.

(tx stats are addressed later)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9618e2ff

net: use this_cpu_ptr() · 7a9b2d59

Committed by Eric Dumazet on Jun 24, 2010

use this_cpu_ptr(p) instead of per_cpu_ptr(p, smp_processor_id())
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7a9b2d59

mac80211: fix the for_each_sta_info macro · 38bdb650

Committed by Felix Fietkau on Jun 25, 2010

Because of an ambiguity in the for_each_sta_info macro, it can
currently only be used if the third parameter is set to 'sta'.
Fix this by renaming the parameter to '_sta'.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

38bdb650

mac80211: use netif_receive_skb in ieee80211_tx_status callpath · 5ed3bc72

Committed by John W. Linville on Jun 24, 2010

This avoids the extra queueing from calling netif_rx.
Signed-off-by: John W. Linville <linville@tuxdriver.com>

5ed3bc72

mac80211: use netif_receive_skb in ieee80211_rx callpath · 5548a8a1

Committed by John W. Linville on Jun 24, 2010

This avoids the extra queueing from calling netif_rx.
Signed-off-by: John W. Linville <linville@tuxdriver.com>

5548a8a1

28 Jun, 2010 2 commits

netfilter: ipt_LOG/ip6t_LOG: add option to print decoded MAC header · 7eb9282c

Committed by Patrick McHardy on Jun 28, 2010

The LOG targets print the entire MAC header as one long string, which is not
very readable:

IN=eth0 OUT= MAC=00:15:f2:24:91:f8:00:1b:24:dc:61:e6:08:00 ...

Add an option to decode known header formats (currently just ARPHRD_ETHER devices)
in their individual fields:

IN=eth0 OUT= MACSRC=00:1b:24:dc:61:e6 MACDST=00:15:f2:24:91:f8 MACPROTO=0800 ...
IN=eth0 OUT= MACSRC=00:1b:24:dc:61:e6 MACDST=00:15:f2:24:91:f8 MACPROTO=86dd ...

The option needs to be explicitly enabled by userspace to avoid breaking
existing parsers.
Signed-off-by: Patrick McHardy <kaber@trash.net>

7eb9282c
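
The decoded format introduced by commit 7eb9282c can be reproduced in userspace with a few lines. The sketch below assumes a plain 14-byte Ethernet header (destination, source, EtherType); `format_mac_decoded` is a hypothetical helper, not the kernel function, and it omits the surrounding `IN=`/`OUT=` fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 14-byte Ethernet header as separate source, destination and protocol fields. */
static int format_mac_decoded(char *buf, size_t size, const uint8_t hdr[14])
{
    /* Ethernet layout: destination (6 bytes), source (6 bytes), EtherType (2 bytes). */
    const uint8_t *dst = hdr;
    const uint8_t *src = hdr + 6;
    unsigned int proto = ((unsigned int)hdr[12] << 8) | hdr[13];

    return snprintf(buf, size,
                    "MACSRC=%02x:%02x:%02x:%02x:%02x:%02x "
                    "MACDST=%02x:%02x:%02x:%02x:%02x:%02x "
                    "MACPROTO=%04x",
                    src[0], src[1], src[2], src[3], src[4], src[5],
                    dst[0], dst[1], dst[2], dst[3], dst[4], dst[5],
                    proto);
}
```

Fed the raw header from the first log line in the commit message, this yields the `MACSRC=`, `MACDST=` and `MACPROTO=0800` fields of the decoded example; note that the destination address comes first on the wire even though the source is printed first.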

netfilter: ipt_LOG/ip6t_LOG: remove comparison within loop · cf377eb4

Committed by Patrick McHardy on Jun 28, 2010

Remove the comparison within the loop to print the macheader by prepending
the colon to all but the first printk.

Based on suggestion by Jan Engelhardt <jengelh@medozas.de>.
Signed-off-by: Patrick McHardy <kaber@trash.net>

cf377eb4
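
The loop restructuring in commit cf377eb4 follows a common pattern: print the first element on its own, then let every later element carry its separator as a prefix. A userspace sketch of that pattern, with the hypothetical helper `format_mac_raw` writing into a buffer instead of calling printk:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format len bytes as colon-separated hex, e.g. "00:15:f2". Returns the length written. */
static int format_mac_raw(char *buf, size_t size, const uint8_t *p, size_t len)
{
    int n;
    size_t i;

    if (size == 0)
        return 0;
    buf[0] = '\0';
    if (len == 0)
        return 0;

    /* The first byte has no separator ... */
    n = snprintf(buf, size, "%02x", p[0]);

    /* ... and every later byte brings its own leading colon, so the loop
     * body needs no "is this the last byte?" comparison. */
    for (i = 1; i < len && (size_t)n < size; i++)
        n += snprintf(buf + n, size - (size_t)n, ":%02x", p[i]);

    return n;
}
```

The alternative of appending a colon after each byte requires a check inside the loop to suppress the trailing separator, which is the comparison the commit removes.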

27 Jun, 2010 2 commits

syncookies: add support for ECN · 172d69e6

Committed by Florian Westphal on Jun 21, 2010

Allows use of ECN when syncookies are in effect by encoding ecn_ok
into the syn-ack tcp timestamp.

While at it, remove an unneeded #ifdef CONFIG_SYN_COOKIES.
With CONFIG_SYN_COOKIES=nm want_cookie is ifdef'd to 0 and gcc
removes the "if (0)".
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

172d69e6

syncookies: do not store rcv_wscale in tcp timestamp · 734f614b

Committed by Florian Westphal on Jun 21, 2010

As pointed out by Fernando Gont there is no need to encode rcv_wscale
into the cookie.

We did not use the restored rcv_wscale anyway; it is recomputed
via tcp_select_initial_window().

Thus we can save 4 bits in the ts option space by removing rcv_wscale.
In case window scaling was not supported, we set the (invalid) wscale
value 0xf.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

734f614b
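
Taken together, commits 172d69e6 and 734f614b describe TCP options being packed into the low bits of the SYN-ACK timestamp. The sketch below shows one way such an encoding could look. The exact bit layout (4 bits of send window scale, then one bit each for SACK and ECN) is an assumption made for this example, as are all names; the commit messages only state that ecn_ok is encoded, that rcv_wscale is no longer stored, and that 0xf marks "window scaling not supported".

```c
#include <stdbool.h>
#include <stdint.h>

#define TS_OPT_BITS    6u                        /* assumed number of low bits used */
#define TS_OPT_MASK    ((1u << TS_OPT_BITS) - 1)
#define TS_WSCALE_NONE 0xfu                      /* invalid wscale: scaling not supported */

/* Hypothetical container for the options carried in the timestamp. */
struct syn_opts {
    bool wscale_ok;
    uint8_t snd_wscale;   /* 0..14 when wscale_ok is set */
    bool sack_ok;
    bool ecn_ok;
};

/* Replace the low bits of the timestamp with the encoded options. */
static uint32_t ts_encode(uint32_t ts, const struct syn_opts *o)
{
    uint32_t bits = o->wscale_ok ? o->snd_wscale : TS_WSCALE_NONE;

    bits |= (uint32_t)o->sack_ok << 4;
    bits |= (uint32_t)o->ecn_ok << 5;
    return (ts & ~TS_OPT_MASK) | bits;
}

/* Recover the options from the timestamp echoed back by the peer. */
static void ts_decode(uint32_t tsecr, struct syn_opts *o)
{
    uint32_t w = tsecr & 0xfu;

    o->wscale_ok = (w != TS_WSCALE_NONE);
    o->snd_wscale = o->wscale_ok ? (uint8_t)w : 0;
    o->sack_ok = (tsecr >> 4) & 1u;
    o->ecn_ok = (tsecr >> 5) & 1u;
}
```

Because the server keeps no per-connection state while syncookies are active, whatever it needs later must survive the round trip inside fields the client echoes back, which is why every bit of timestamp space matters.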

26 Jun, 2010 1 commit

ipv6: remove ipv6_statistics · 9587c6dd

Committed by Eric Dumazet on Jun 23, 2010

commit 9261e537 (ipv6: making ip and icmp statistics per/namespace)
forgot to remove ipv6_statistics variable.

commit bc417d99 (ipv6: remove stale MIB definitions) took care of
icmpv6_statistics & icmpv6msg_statistics
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Denis V. Lunev <den@openvz.org>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9587c6dd

openeuler / Kernel synced successfully nearly 2 years ago