Commits · bd16a6cce2a7f169b559abc5672fd2c66e91fb36 · openeuler / Kernel

05 Jan, 2012 3 commits

net_sched: sfq: fix mem alloc error recovery · bd16a6cc

Authored by Eric Dumazet on Jan 04, 2012

Since commit 817fb15d (net_sched: sfq: allow divisor to be a
parameter), we can leave perturbation timer armed if a memory allocation
error aborts sfq_init().

Memory containing active struct timer_list is freed and kernel can
crash.

Call sfq_destroy() from sfq_init() to properly dismantle qdisc.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bd16a6cc
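
The recovery pattern described in bd16a6cc above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: `toy_qdisc`, `toy_init()`, `toy_destroy()` and the `simulate_oom` switch are hypothetical names, a boolean stands in for the armed perturbation timer, and `-1` stands in for `-ENOMEM`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the qdisc private data. */
struct toy_qdisc {
    bool timer_armed; /* models an armed struct timer_list */
    int *table;       /* models the divisor-sized table */
};

/* Full teardown: stop the timer first, then release the memory. */
void toy_destroy(struct toy_qdisc *q)
{
    assert(q != NULL);
    q->timer_armed = false;
    free(q->table);
    q->table = NULL;
}

/* Returns 0 on success, -1 (standing in for -ENOMEM) on failure. */
int toy_init(struct toy_qdisc *q, size_t divisor, bool simulate_oom)
{
    assert(q != NULL);
    q->table = NULL;
    q->timer_armed = true; /* the timer is armed before the allocation */

    q->table = simulate_oom ? NULL : calloc(divisor, sizeof(*q->table));
    if (q->table == NULL) {
        /* Reuse the destroy path so the timer never outlives the object. */
        toy_destroy(q);
        return -1;
    }
    return 0;
}
```

Routing the failure through the same teardown function used for normal destruction means there is only one place that has to know which resources were already set up.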

ethtool: Remove ethtool_ops::set_rx_ntuple operation · 6cfb5e75

Authored by Ben Hutchings on Jan 03, 2012

All implementations have been converted to implement set_rxnfc
instead.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6cfb5e75

ethtool: Allow drivers to select RX NFC rule locations · 55664f32

Authored by Ben Hutchings on Jan 03, 2012

Define special location values for RX NFC that request the driver to
select the actual rule location.  This allows for implementation on
devices that use hash-based filter lookup, whereas currently the API is
more suited to devices with TCAM lookup or linear search.

In ethtool_set_rxnfc() and the compat wrapper ethtool_ioctl(), copy
the structure back to user-space after insertion so that the actual
location is returned.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

55664f32

04 Jan, 2012 3 commits

net_sched: qdisc_alloc_handle() can be too slow · fa0f5aa7

Authored by Eric Dumazet on Jan 03, 2012

When trying to allocate ~32768 qdiscs using the autohandle mechanism, we can
fill the space managed by the kernel (handles in the [8000-FFFF]:0000 range).

But the O(N^2) qdisc_alloc_handle() loops 0x10000 times instead of 0x8000:

time tc qdisc add dev eth0 parent 10:7fff pfifo limit 10
RTNETLINK answers: Cannot allocate memory
real    1m54.826s
user    0m0.000s
sys     0m0.004s

INFO: rcu_sched_state detected stall on CPU 0 (t=60000 jiffies)

Halve the number of loops, and add a cond_resched() call.
We hold rtnl at this point.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa0f5aa7

sch_qfq: accurate wsum handling · d32ae76f

Authored by Eric Dumazet on Jan 02, 2012

We can underestimate q->wsum in case of "tc class replace ... qfq"
and/or qdisc_create_dflt() error.

wsum is not really used in fast path, only at qfq qdisc/class setup,
to catch user error.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d32ae76f

sch_sfq: dont put new flow at the end of flows · d47a0ac7

Authored by Eric Dumazet on Jan 01, 2012

The SFQ enqueue algorithm puts a new flow _behind_ all pre-existing flows in the
circular list. In fact this is probably an old SFQ implementation bug.

100 Mbits = ~8333 full frames per second, or ~8 frames per ms.

With 50 flows, it means your "new flow" will have to wait for 50 packets
to be sent before its own packet. That's the ~6 ms.

We certainly can change SFQ to give a priority advantage to new flows,
so that the next dequeued packet is taken from a new flow, not an old one.
Reported-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d47a0ac7
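
The latency estimate in d47a0ac7 above can be reproduced with a short calculation. The sketch assumes 1500-byte full frames (the commit message gives only the resulting ~8333 frames per second), a link rate of exactly 100,000,000 bit/s, and one queued packet per pre-existing flow; the function names are hypothetical.

```c
#include <assert.h>

/* Frames the link can send per millisecond at the given frame size. */
double frames_per_ms(double link_bps, double frame_bytes)
{
    assert(link_bps > 0.0 && frame_bytes > 0.0);
    return link_bps / (frame_bytes * 8.0) / 1000.0;
}

/* Time a new flow waits if one packet from each older flow goes first. */
double new_flow_wait_ms(unsigned flows_ahead, double link_bps,
                        double frame_bytes)
{
    return (double)flows_ahead / frames_per_ms(link_bps, frame_bytes);
}
```

With these assumptions, `frames_per_ms(100e6, 1500.0)` is about 8.33 and `new_flow_wait_ms(50, 100e6, 1500.0)` is about 6 ms, matching the figures quoted in the commit message.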

01 Jan, 2012 1 commit

netfilter: nfnetlink_acct: fix nfnl_acct_get operation · 3ab0b245

Authored by Pablo Neira Ayuso on Dec 30, 2011

The get operation was not sending the message that was built to
user-space. This patch also includes the appropriate handling for
the return value of netlink_unicast().

Moreover, fix the error codes returned on failure (for example, the
code for a non-existing entry was incorrect).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

3ab0b245

31 Dec, 2011 9 commits

netfilter: ctnetlink: fix timeout calculation · c1216382

Authored by Xi Wang on Dec 30, 2011

The sanity check (timeout < 0) never works; the dividend is unsigned
and so is the division, which should have been a signed division.

	long timeout = (ct->timeout.expires - jiffies) / HZ;
	if (timeout < 0)
		timeout = 0;

This patch converts the time values to signed for the division.
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

c1216382
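
The signedness problem described in c1216382 above can be demonstrated in user-space C. This is a sketch under stated assumptions: the function names and the `hz` parameter are hypothetical, and the signed variant converts the wrapped difference to `long` (which relies on the usual two's complement conversion) as one way of getting a signed division; the patch itself may cast the operands differently.

```c
#include <assert.h>

/* Mirrors the original pattern: the subtraction and the division are both
 * unsigned, so an already expired timer yields a huge positive value and
 * the "< 0" check can never fire. */
long remaining_secs_unsigned(unsigned long expires, unsigned long now, long hz)
{
    assert(hz > 1);
    long timeout = (long)((expires - now) / (unsigned long)hz);

    if (timeout < 0)
        timeout = 0;
    return timeout;
}

/* Signed variant: the difference is interpreted as signed before dividing,
 * so an expired timer gives a negative value that is clamped to 0. */
long remaining_secs_signed(unsigned long expires, unsigned long now, long hz)
{
    assert(hz > 1);
    long delta = (long)(expires - now);
    long timeout = delta / hz;

    if (timeout < 0)
        timeout = 0;
    return timeout;
}
```

For a timer that expired 500 ticks ago with `hz` set to 100, the unsigned variant returns a large positive number while the signed variant returns 0; for timers that are still pending, both agree.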

ipvs: try also real server with port 0 in backup server · 52793dbe

Authored by Julian Anastasov on Dec 30, 2011

We should not forget to try for a real server with port 0
in the backup server when processing the sync message. We should
do it in all cases because the backup server can use a different
forwarding method.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

52793dbe

netem: fix classful handling · 50612537

Authored by Eric Dumazet on Dec 28, 2011

Commit 10f6dfcf (Revert "sch_netem: Remove classful functionality")
reintroduced classful functionality to netem, but broke basic netem
behavior:

netem uses a t(ime)fifo queue, and stores timestamps in skb->cb[].

If the qdisc is changed, time constraints are not respected and the other qdisc
can destroy skb->cb[] and block netem at dequeue time.

Fix this by always using the internal tfifo, and optionally attaching a child
qdisc to netem (or a tree of qdiscs).

Example of use:

DEV=eth3
tc qdisc del dev $DEV root
tc qdisc add dev $DEV root handle 30: est 1sec 8sec netem delay 20ms 10ms
tc qdisc add dev $DEV handle 40:0 parent 30:0 tbf \
	burst 20480 limit 20480 mtu 1514 rate 32000bps

qdisc netem 30: root refcnt 18 limit 1000 delay 20.0ms  10.0ms
 Sent 190792 bytes 413 pkt (dropped 0, overlimits 0 requeues 0)
 rate 18416bit 3pps backlog 0b 0p requeues 0
qdisc tbf 40: parent 30: rate 256000bit burst 20Kb/8 mpu 0b lat 0us
 Sent 190792 bytes 413 pkt (dropped 6, overlimits 10 requeues 0)
 backlog 0b 5p requeues 0
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

50612537

IPv6: Avoid taking write lock for /proc/net/ipv6_route · 32b293a5

Authored by Josh Hunt on Dec 28, 2011

During some debugging I needed to look into how /proc/net/ipv6_route
operated, and in my digging I found that it calls fib6_clean_all(), which uses
"write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
found this on 2.6.32, but reading the code I believe the same basic idea
exists currently. Looking at the rtnetlink code, it only calls
"read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
reading from proc isn't the recommended way of fetching the ipv6 route
table, taking a write lock seems unnecessary and would probably cause
network performance issues.

To verify this I loaded up the ipv6 route table and then ran iperf in 3
cases:
  * doing nothing
  * reading ipv6 route table via proc
    (while :; do cat /proc/net/ipv6_route > /dev/null; done)
  * reading ipv6 route table via rtnetlink
    (while :; do ip -6 route show table all > /dev/null; done)

* Load the ipv6 route table up with:
  * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done

* iperf commands:
  * client: iperf -i 1 -V -c <ipv6 addr>
  * server: iperf -V -s

* iperf results - 3 runs each (in Mbits/sec)
  * nothing: client: 927,927,927 server: 927,927,927
  * proc: client: 179,97,96,113 server: 142,112,133
  * iproute: client: 928,927,928 server: 927,927,927

lock_stat shows taking the write lock is causing the slowdown. Using this
info I decided to write a version of fib6_clean_all() which replaces
write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
this new function I see the same results as with my rtnetlink iperf test.
Signed-off-by: Josh Hunt <joshhunt00@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

32b293a5

unix_diag: Fixup RQLEN extension report · c9da99e6

Authored by Pavel Emelyanov on Dec 30, 2011

While it's not too late, fix the recently added RQLEN diag extension
to report rqlen and wqlen in the same way as TCP does.

I.e. for listening sockets, report the ack backlog length (which is the input
queue length for the socket) in rqlen and the max ack backlog length in
wqlen, and for established sockets, report what the CINQ/OUTQ ioctls do.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c9da99e6

af_unix: Move CINQ/COUTQ code to helpers · 885ee74d

Authored by Pavel Emelyanov on Dec 30, 2011

Currently tcp diag reports rqlen and wqlen values similar to how
the CINQ/COUTQ ioctls do. To make unix diag report these values
in the same way, move the respective code into helpers.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

885ee74d

unix_diag: Add the MEMINFO extension · 257b5298

Authored by Pavel Emelyanov on Dec 30, 2011

[ Fix indentation of sock_diag*() calls. -DaveM ]
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

257b5298

inet_diag: Add the SKMEMINFO extension · c0636faa

Authored by Pavel Emelyanov on Dec 30, 2011

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0636faa

sock_diag: Introduce the meminfo nla core (v2) · 5d2e5f27

Authored by Pavel Emelyanov on Dec 30, 2011

Add a routine that dumps memory-related values of a socket.
It's made as an array to make it possible to add more stuff
here later without breaking compatibility.

Since v1: The SK_MEMINFO_ constants are in the userspace-visible
part of sock_diag.h; the rest is under __KERNEL__.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d2e5f27

30 Dec, 2011 12 commits

tipc: rename struct bearer_name to struct tipc_bearer_names · f19765f4

Authored by Paul Gortmaker on Dec 29, 2011

The addition of the "s" to indicate pluralization is intentional,
since the struct actually contains two name variants.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

f19765f4

tipc: rename struct link* to struct tipc_link* · a18c4bc3

Authored by Paul Gortmaker on Dec 29, 2011

This converts the following:

	struct link		->	struct tipc_link
	struct link_req		->	struct tipc_link_req
	struct link_name	->	struct tipc_link_name
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

a18c4bc3

tipc: rename struct bcbearer* to tipc_bcbearer* · 7f9ab6ac

Authored by Paul Gortmaker on Dec 29, 2011

This changes both the struct bcbearer and struct bcbearer_pair to
have the "tipc_" prefix. Runtime behaviour is unchanged.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

7f9ab6ac

tipc: rename struct bclink to struct tipc_bclink · 6765fd67

Authored by Paul Gortmaker on Dec 29, 2011

Make this rename so that it is consistent with the majority
of the other tipc structs and to assist in removing any
ambiguity with other similar names in other subsystems.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

6765fd67

tipc: rename struct subscriber to struct tipc_subscriber · 11f99906

Authored by Paul Gortmaker on Dec 29, 2011

Make this rename so that it is consistent with the majority
of the other tipc structs and to assist in removing any
ambiguity with other similar names in other subsystems.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

11f99906

tipc: rename struct subscription to struct tipc_subscription · fead3909

Authored by Paul Gortmaker on Dec 29, 2011

Make this rename so that it is consistent with the majority
of the other tipc structs and to assist in removing any
ambiguity with other similar names in other subsystems.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

fead3909

tipc: rename struct port_list to struct tipc_port_list · 4584310b

Authored by Paul Gortmaker on Dec 29, 2011

Make this rename so that it is consistent with the majority
of the other tipc structs and to assist in removing any
ambiguity with other similar names in other subsystems.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

4584310b

tipc: rename struct media to struct tipc_media · 358a0d1c

Authored by Paul Gortmaker on Dec 29, 2011

Give it a meaningful prefix, as suggested by DaveM, so that it
is consistent with things like struct tipc_bearer, and so it isn't
confused with anything else.  This has no impact on the actual
runtime code behaviour.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

358a0d1c

ipv6: Fix neigh lookup using NULL device. · 8ade06c6

Authored by David S. Miller on Dec 29, 2011

Some of the rt6_bind_neighbour() call sites haven't hooked
up the rt->dst.dev pointer yet, so we'd deref a NULL pointer when
obtaining dev->ifindex for the neighbour hash function computation.

Just pass the netdevice in explicitly to fix this problem.
Reported-by: Bjarke Istrup Pedersen <gurligebis@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ade06c6

ipv6: Report TCP timetstamp info in cacheinfo just like ipv4 does. · 346f870b

Authored by David S. Miller on Dec 29, 2011

I missed this while adding ipv6 support to inet_peer.
Signed-off-by: David S. Miller <davem@davemloft.net>

346f870b

sch_tbf: report backlog information · b0460e44

Authored by Eric Dumazet on Dec 28, 2011

Provide child qdisc backlog (byte count) information so that "tc -s
qdisc" can report it to user.

qdisc netem 30: root refcnt 18 limit 1000 delay 20.0ms  10.0ms
 Sent 948517 bytes 898 pkt (dropped 0, overlimits 0 requeues 1)
 rate 175056bit 16pps backlog 114b 1p requeues 1
qdisc tbf 40: parent 30: rate 256000bit burst 20Kb/8 mpu 0b lat 0us
 Sent 948517 bytes 898 pkt (dropped 15, overlimits 611 requeues 0)
 backlog 18168b 12p requeues 0
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b0460e44

netfilter: Kconfig: fix unmet xt_nfacct dependencies · bc94b521

Authored by Pablo Neira Ayuso on Dec 28, 2011

warning: (NETFILTER_XT_MATCH_NFACCT) selects NETFILTER_NETLINK_ACCT which has
unmet direct dependencies (NET && INET && NETFILTER && NETFILTER_ADVANCED)

and then

ERROR: "nfnetlink_subsys_unregister" [net/netfilter/nfnetlink_acct.ko] undefined!
ERROR: "nfnetlink_subsys_register" [net/netfilter/nfnetlink_acct.ko] undefined!
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

bc94b521

29 Dec, 2011 7 commits

ipv6: Kill rt6i_dev and rt6i_expires defines. · d1918542

Authored by David S. Miller on Dec 28, 2011

It just obscures that the netdevice pointer and the expires value are
implemented in the dst_entry sub-object of the ipv6 route.

And it makes grepping for dst_entry member uses much harder too.
Signed-off-by: David S. Miller <davem@davemloft.net>

d1918542

ipv6: Create fast inline ipv6 neigh lookup just like ipv4. · f83c7790

Authored by David S. Miller on Dec 28, 2011

Also, create and use an rt6_bind_neighbour() in net/ipv6/route.c to
consolidate some common logic.
Signed-off-by: David S. Miller <davem@davemloft.net>

f83c7790

ipv6: Use universal hash for NDISC. · 2c2aba6c

Authored by David S. Miller on Dec 28, 2011

In order to perform a proper universal hash on a vector of integers,
we have to use different universal hashes on each vector element.

Which means we need 4 different hash randoms for ipv6.
Signed-off-by: David S. Miller <davem@davemloft.net>

2c2aba6c
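
The idea behind 2c2aba6c above, one random multiplier per vector element, can be sketched for a 128-bit IPv6 address viewed as four 32-bit words. This is an illustrative user-space sketch, not the kernel's hash function: `vector_hash()` and `ADDR_WORDS` are hypothetical names, and anything else the real neighbour hash mixes in is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_WORDS 4 /* a 128-bit IPv6 address as four 32-bit words */

/* Multiply each word by its own random value and sum the products.
 * Using a separate random per element is what the commit message means
 * by needing 4 hash randoms. */
uint32_t vector_hash(const uint32_t words[ADDR_WORDS],
                     const uint32_t rnd[ADDR_WORDS])
{
    assert(words != NULL && rnd != NULL);
    uint32_t h = 0;

    for (size_t i = 0; i < ADDR_WORDS; i++)
        h += words[i] * rnd[i]; /* unsigned arithmetic wraps modulo 2^32 */
    return h;
}
```

If a single random value were shared by all four words, addresses whose words are permutations of each other would always collide; per-element randoms avoid that structural weakness.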

netrom: avoid overflows in nr_setsockopt() · 32288eb4

Authored by Xi Wang on Dec 27, 2011

Check setsockopt arguments to avoid overflows and return -EINVAL for
arguments that are too large.
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

32288eb4

ax25: avoid overflows in ax25_setsockopt() · ba1cffe0

Authored by Xi Wang on Dec 27, 2011

Commit be639ac6 ("NET: AX.25: Check ioctl arguments to avoid overflows
further down the road") rejects very large arguments, but doesn't
completely fix overflows on 64-bit systems.  Consider the AX25_T2 case.

	int opt;
	...
	if (opt < 1 || opt > ULONG_MAX / HZ) {
		res = -EINVAL;
		break;
	}
	ax25->t2 = opt * HZ;

The 32-bit multiplication opt * HZ would overflow before being assigned
to 64-bit ax25->t2.  This patch changes "opt" to unsigned long.
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba1cffe0
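
The overflow described in ba1cffe0 above comes from doing the multiplication in a 32-bit type before the result is widened. The sketch below is a user-space illustration with hypothetical names (`TOY_HZ`, `product_32bit()`, `set_t2()`), and `-1` stands in for `-EINVAL`. It uses an unsigned 32-bit product to show the truncation safely, because overflowing a signed `int` is undefined behaviour in C.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HZ 1000UL /* hypothetical tick rate */

/* What a 32-bit multiplication produces: the result wraps modulo 2^32
 * before it could ever be stored in a wider variable. */
uint32_t product_32bit(uint32_t opt)
{
    return opt * (uint32_t)TOY_HZ;
}

/* With "opt" held as unsigned long, the range check and the product use
 * the same width, so the product cannot overflow once the check passes.
 * Returns 0 on success, -1 (standing in for -EINVAL) on a bad argument. */
int set_t2(unsigned long opt, unsigned long *t2)
{
    assert(t2 != NULL);
    if (opt < 1 || opt > ULONG_MAX / TOY_HZ)
        return -1;
    *t2 = opt * TOY_HZ;
    return 0;
}
```

For example, `product_32bit(5000000)` yields 705032704 instead of 5000000000, while `set_t2(5000000, &t2)` stores the full value on a system where `unsigned long` is 64 bits wide.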

genetlink: add auto module loading · fa843095

Authored by Stephen Hemminger on Dec 28, 2011

When testing L2TP support, I discovered that the l2tp module is not autoloaded
as other netlink interfaces are. This is because genetlink lacks a hook to call
request_module and load the module.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa843095

ipv6: Remove optimistic DAD flag test in ipv6_add_addr() · 7ffbcecb

Authored by David Miller on Dec 27, 2011

The route we have here is for the address being added to the interface,
i.e. for input packet processing.

Therefore using that route to determine whether an output nexthop gateway
is known and resolved doesn't make any sense.

So, simply remove this test; it never triggered anyway.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Neil Horman <nhorman@tuxdriver.com>

7ffbcecb

28 Dec, 2011 5 commits

packet: fix possible dev refcnt leak when bind fail · aef950b4

Authored by Wei Yongjun on Dec 27, 2011

If bind fails when it is called after setting the PACKET_FANOUT
socket option, the dev refcnt will leak.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

aef950b4

netfilter: provide config option to disable ancient procfs parts · 54b07dca

Authored by Jan Engelhardt on Apr 21, 2011

Using /proc/net/nf_conntrack has been deprecated in favour of the
conntrack(8) tool.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

54b07dca

netfilter: xtables: collapse conditions in xt_ecn · 42c344a3

Authored by Jan Engelhardt on Jun 09, 2011

One simplification of an if clause.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

42c344a3

netfilter: xtables: add an IPv6 capable version of the ECN match · af0d29cd

Authored by Patrick McHardy on Jun 09, 2011

References: http://www.spinics.net/lists/netfilter-devel/msg18875.html

Augment xt_ecn by facilities to match on IPv6 packets' DSCP/TOS field
similar to how it is already done for the IPv4 packet field.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

af0d29cd

netfilter: xtables: give xt_ecn its own name · a4c6f9d3

Authored by Jan Engelhardt on Jun 09, 2011

Use the new macro and struct names in xt_ecn.h, and put the old
definitions into a definition-forwarding ipt_ecn.h.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

a4c6f9d3

openeuler / Kernel: synced successfully more than 1 year ago