Commits · ae90bdeaeac6b964b7a1e853a90a19f358a9ac20 · openeuler / raspberrypi-kernel

16 Nov, 2010 2 commits

netfilter: nf_nat_amanda: rename a variable · ab0cba25

Authored by Eric Dumazet on Nov 15, 2010

Avoid a sparse warning about 'ret' variable shadowing
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

netfilter: add __rcu annotations · eb733162

Authored by Eric Dumazet on Nov 15, 2010

Use helpers to reduce number of sparse warnings
(CONFIG_SPARSE_RCU_POINTER=y)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

15 Nov, 2010 2 commits

netfilter: nf_nat: don't use atomic bit operation · 76a2d3bc

Authored by Changli Gao on Nov 15, 2010

As we own the conntrack and the others can't see it until we confirm it,
we don't need to use atomic bit operation on ct->status.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

netfilter: xt_LOG: do print MAC header on FORWARD · b468645d

Authored by Jan Engelhardt on Nov 15, 2010

I am observing consistent behavior even with bridges, so let's unlock
this. xt_mac is already usable in FORWARD, too. Section 9 of
http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html#section9 says
the MAC source address is changed, but my observation does not match
that claim -- the MAC header is retained.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
[Patrick; code inspection seems to confirm this]
Signed-off-by: Patrick McHardy <kaber@trash.net>

12 Nov, 2010 2 commits

ipv4: Make rt->fl.iif tests less obscure. · c7537967

Authored by David S. Miller on Nov 11, 2010

When we test rt->fl.iif against zero, we're seeing if it's
an output or an input route.

Make that explicit with some helper functions.
Signed-off-by: David S. Miller <davem@davemloft.net>
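A minimal userspace sketch of the helpers this commit describes. The `_sketch` structures are simplified stand-ins invented for illustration (only the `fl.iif` field is modelled), not the kernel definitions; the helper names follow the idea in the message, where a non-zero input interface index marks an input route.

```c
#include <stdbool.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct flowi_sketch {
	int iif;	/* input interface index, 0 for locally generated traffic */
};

struct rtable_sketch {
	struct flowi_sketch fl;
};

/* A non-zero iif means the route was set up for a received packet. */
static inline bool rt_is_input_route(const struct rtable_sketch *rt)
{
	return rt->fl.iif != 0;
}

/* A zero iif means the route was set up for locally originated output. */
static inline bool rt_is_output_route(const struct rtable_sketch *rt)
{
	return rt->fl.iif == 0;
}
```

Callers can then write `if (rt_is_output_route(rt))` instead of comparing `rt->fl.iif` with zero, which states the intent at the call site.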

net: get rid of rtable->idev · 72cdd1d9

Authored by Eric Dumazet on Nov 11, 2010

It seems idev field in struct rtable has no special purpose, but adding
extra atomic ops.

We hold refcounts on the device itself (using percpu data, so pretty
cheap in current kernel).

infiniband case is solved using dst.dev instead of idev->dev

Removal of this field means routing without route cache is now using
shared data, percpu data, and only potential contention is a pair of
atomic ops on struct neighbour per forwarded packet.

About 5% speedup on routing test.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10 Nov, 2010 1 commit

net/ipv4/tcp.c: Update WARN uses · 2af6fd8b

Authored by Joe Perches on Oct 30, 2010

Coalesce long formats.
Align arguments.
Remove KERN_<level>.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Nov, 2010 2 commits

inet_diag: Make sure we actually run the same bytecode we audited. · 22e76c84

Authored by Nelson Elhage on Nov 03, 2010

We were using nlmsg_find_attr() to look up the bytecode by attribute when
auditing, but then just using the first attribute when actually running
bytecode. So, if we received a message with two attribute elements, where only
the second had type INET_DIAG_REQ_BYTECODE, we would validate and run different
bytecode strings.

Fix this by consistently using nlmsg_find_attr everywhere.
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
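The mismatch described above can be illustrated with a small userspace sketch. Everything here is hypothetical: `attr_sketch`, `ATTR_BYTECODE_SKETCH` and the three functions are invented stand-ins for netlink attributes and are not the kernel's `nlmsg_find_attr()` API. The point is only that the audit path and the run path should locate the bytecode through the same lookup.

```c
#include <stddef.h>

#define ATTR_BYTECODE_SKETCH 1	/* hypothetical attribute type */

/* Hypothetical stand-in for one attribute in a request. */
struct attr_sketch {
	int type;
	const char *payload;
};

/* Look an attribute up by type; returns NULL when none matches. */
static const struct attr_sketch *find_attr_sketch(const struct attr_sketch *attrs,
						  size_t n, int type)
{
	for (size_t i = 0; i < n; i++)
		if (attrs[i].type == type)
			return &attrs[i];
	return NULL;
}

/* Buggy pattern: assume the bytecode is always the first attribute. */
static const char *bytecode_first_attr(const struct attr_sketch *attrs, size_t n)
{
	return n ? attrs[0].payload : NULL;
}

/* Fixed pattern: use the same by-type lookup that the audit step uses. */
static const char *bytecode_by_type(const struct attr_sketch *attrs, size_t n)
{
	const struct attr_sketch *a = find_attr_sketch(attrs, n, ATTR_BYTECODE_SKETCH);

	return a ? a->payload : NULL;
}
```

With a request whose second attribute carries the bytecode, the two functions return different payloads, which is the situation the commit closes off.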

fib: fib_result_assign() should not change fib refcounts · 1f1b9c99

Authored by Eric Dumazet on Nov 04, 2010

After commit ebc0ffae (RCU conversion of fib_lookup()),
fib_result_assign()  should not change fib refcounts anymore.

Thanks to Michael who did the bisection and bug report.
Reported-by: Michael Ellerman <michael@ellerman.id.au>
Tested-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Nov, 2010 2 commits

ipv4: netfilter: ip_tables: fix information leak to userland · b5f15ac4

Authored by Vasiliy Kulikov on Nov 03, 2010

Structure ipt_getinfo is copied to userland with the field "name"
that has the last elements uninitialized.  It leads to leaking of
contents of kernel stack memory.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

ipv4: netfilter: arp_tables: fix information leak to userland · 1a8b7a67

Authored by Vasiliy Kulikov on Nov 03, 2010

Structure arpt_getinfo is copied to userland with the field "name"
that has the last elements uninitialized.  It leads to leaking of
contents of kernel stack memory.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
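Both leaks above share one pattern: a stack structure with a fixed-size name buffer is only partially written before being copied out. A common way to avoid this, sketched here in userspace under stated assumptions, is to zero the whole structure first. `getinfo_sketch`, `NAME_LEN_SKETCH` and `fill_info_sketch()` are hypothetical names for illustration and do not reproduce the kernel structures or the exact patch.

```c
#include <string.h>

#define NAME_LEN_SKETCH 32	/* hypothetical buffer size */

/* Hypothetical stand-in for a structure that is copied to userland. */
struct getinfo_sketch {
	char name[NAME_LEN_SKETCH];
	unsigned int valid_hooks;
};

static void fill_info_sketch(struct getinfo_sketch *info, const char *table_name)
{
	/* Clear every byte first so no stale stack contents survive
	 * behind the terminating NUL of the name. */
	memset(info, 0, sizeof(*info));
	strncpy(info->name, table_name, sizeof(info->name) - 1);
	info->valid_hooks = 0x7;	/* arbitrary example value */
}
```

Without the `memset()`, the bytes after the string terminator would keep whatever the stack held before, and copying the full structure out would expose them.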

31 Oct, 2010 1 commit

ip_gre: fix fallback tunnel setup · 3285ee3b

Authored by Eric Dumazet on Oct 30, 2010

Before making the fallback tunnel visible to lookups, we should make
sure it is completely set up, once ipgre_tunnel_init() has been called
and the tstats per_cpu pointer allocated.

move rcu_assign_pointer(ign->tunnels_wc[0], tunnel); from
ipgre_fb_tunnel_init() to ipgre_init_net()

Based on a patch from Pavel Emelyanov
Reported-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Oct, 2010 2 commits

netfilter: nf_nat: fix compiler warning with CONFIG_NF_CT_NETLINK=n · 64e46749

Authored by Patrick McHardy on Oct 29, 2010

net/ipv4/netfilter/nf_nat_core.c:52: warning: 'nf_nat_proto_find_get' defined but not used
net/ipv4/netfilter/nf_nat_core.c:66: warning: 'nf_nat_proto_put' defined but not used
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>

fib: Fix fib zone and its hash leak on namespace stop · 4aa2c466

Authored by Pavel Emelyanov on Oct 28, 2010

When we stop a namespace we flush the table and free one, but the
added fn_zone-s (and their hashes if grown) are leaked. Need to free.
Tries releases all its stuff in the flushing code.

Shame on us - this bug exists since the very first make-fib-per-net
patches in 2.6.27 :(
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Oct, 2010 5 commits

tunnels: Fix tunnels change rcu protection · 74b0b85b

Authored by Pavel Emelyanov on Oct 27, 2010

After making rcu protection for tunnels (ipip, gre, sit and ip6) a bug
was introduced into the SIOCCHGTUNNEL code.

The tunnel is first unlinked, then addresses change, then it is linked
back probably into another bucket. But while changing the parms, the
hash table is unlocked to readers and they can lookup the improper tunnel.

Respective commits are b7285b79 (ipip: get rid of ipip_lock), 1507850b
(gre: get rid of ipgre_lock), 3a43be3c (sit: get rid of ipip6_lock) and
94767632 (ip6tnl: get rid of ip6_tnl_lock).

The quick fix is to wait for quiescent state to pass after unlinking,
but if it is inappropriate I can invent something better, just let me
know.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inetpeer: __rcu annotations · b914c4ea

Authored by Eric Dumazet on Oct 25, 2010

Adds __rcu annotations to inetpeer
	(struct inet_peer)->avl_left
	(struct inet_peer)->avl_right

This is a tedious cleanup, but removes one smp_wmb() from link_to_pool()
since we now use more self documenting rcu_assign_pointer().

Note the use of RCU_INIT_POINTER() instead of rcu_assign_pointer() in
all cases where we don't need a memory barrier.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tunnels: add __rcu annotations · b33eab08

Authored by Eric Dumazet on Oct 25, 2010

Add __rcu annotations to :
        (struct ip_tunnel)->prl
        (struct ip_tunnel_prl_entry)->next
        (struct xfrm_tunnel)->next
	struct xfrm_tunnel *tunnel4_handlers
	struct xfrm_tunnel *tunnel64_handlers

And use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add __rcu annotations to protocol · e0ad61ec

Authored by Eric Dumazet on Oct 25, 2010

Add __rcu annotations to :
        struct net_protocol *inet_protos
        struct net_protocol *inet6_protos

And use appropriate casts to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: add __rcu annotations to routes.c · 1c31720a

Authored by Eric Dumazet on Oct 25, 2010

Add __rcu annotations to :
        (struct dst_entry)->rt_next
        (struct rt_hash_bucket)->chain

And use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Oct, 2010 1 commit

fib_hash: fix rcu sparse and logical errors · ded85aa8

Authored by Eric Dumazet on Oct 26, 2010

While fixing CONFIG_SPARSE_RCU_POINTER errors, I had to fix accesses to
fz->fz_hash for real.

-	&fz->fz_hash[fn_hash(f->fn_key, fz)]
+	rcu_dereference(fz->fz_hash) + fn_hash(f->fn_key, fz)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Oct, 2010 3 commits

ipv4: add __rcu annotations to ip_ra_chain · 43a951e9

Authored by Eric Dumazet on Oct 25, 2010

Add __rcu annotations to :
        (struct ip_ra_chain)->next
	struct ip_ra_chain *ip_ra_chain;

And use appropriate rcu primitives.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add __rcu annotation to sk_filter · 0d7da9dd

Authored by Eric Dumazet on Oct 25, 2010

Add __rcu annotation to :
        (struct sock)->sk_filter

And use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tunnels: add __rcu annotations · 6f0bcf15

Authored by Eric Dumazet on Oct 24, 2010

(struct ip6_tnl)->next is rcu protected :
(struct ip_tunnel)->next is rcu protected :
(struct xfrm6_tunnel)->next is rcu protected :

add __rcu annotation and proper rcu primitives.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Oct, 2010 4 commits

nf_nat: restrict ICMP translation for embedded header · b0aeef30

Authored by Julian Anastasov on Oct 11, 2010

Skip ICMP translation of embedded protocol header
if NAT bits are not set. Needed for IPVS to see the original
embedded addresses because for IPVS traffic the IPS_SRC_NAT_BIT
and IPS_DST_NAT_BIT bits are not set. It happens when IPVS performs
DNAT for client packets after using nf_conntrack_alter_reply
to expect replies from real server.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() · 093d2823

Authored by Balazs Scheidler on Oct 21, 2010

When __inet_inherit_port() is called on a tproxy connection the wrong locks are
held for the inet_bind_bucket it is added to. __inet_inherit_port() made an
implicit assumption that the child socket shares the listener's port number
(and thus its bind bucket).
Unfortunately, if you're using the TPROXY target to redirect skbs to a
transparent proxy that assumption is not true anymore and things break.

This patch adds code to __inet_inherit_port() so that it can handle this case
by looking up or creating a new bind bucket for the child socket and updates
callers of __inet_inherit_port() to gracefully handle __inet_inherit_port()
failing.

Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>.
See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion.
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

fib: introduce fib_alias_accessed() helper · 9b0c290e

Authored by Eric Dumazet on Oct 20, 2010

Perf tools session at NFWS 2010 pointed out a false sharing on struct
fib_alias that can be avoided pretty easily, if we set FA_S_ACCESSED bit
only if needed (ie : not already set)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
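The "set the bit only if needed" idea can be sketched as follows. The structure and flag value are simplified assumptions (`fib_alias_sketch` and `FA_S_ACCESSED_SKETCH` are hypothetical stand-ins, and the flag's numeric value is chosen for illustration); the point is that a read-only test avoids dirtying a cache line that other CPUs are also reading.

```c
#define FA_S_ACCESSED_SKETCH 0x01	/* assumed flag value, for illustration */

/* Hypothetical stand-in holding only the state byte under discussion. */
struct fib_alias_sketch {
	unsigned char fa_state;
};

static inline void fib_alias_accessed_sketch(struct fib_alias_sketch *fa)
{
	/* Write only when the bit is clear: an unconditional store would
	 * invalidate the cache line on every other CPU for each lookup. */
	if (!(fa->fa_state & FA_S_ACCESSED_SKETCH))
		fa->fa_state |= FA_S_ACCESSED_SKETCH;
}
```

After the first lookup the function only reads `fa_state`, so the line can stay shared in every CPU's cache.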

secmark: export secctx, drop secmark in procfs · 1ae4de0c

Authored by Eric Paris on Oct 13, 2010

The current secmark code exports a secmark= field which just indicates if
there is special labeling on a packet or not.  We drop this field as it
isn't particularly useful and instead export a new field secctx= which is
the actual human readable text label.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>

20 Oct, 2010 1 commit

net: avoid RCU for NOCACHE dst · 27b75c95

Authored by Eric Dumazet on Oct 15, 2010

There is no point using RCU for dst we allocate for a very short time
(used once).

Change dst_release() to take DST_NOCACHE into account, but also change
skb_dst_set_noref() to force a refcount increment for such dst.

This is a _huge_ gain, because we don't waste memory storing thousands
of dsts. Instead of queueing them to RCU, we can free them instantly.

CPU caches can stay hot, re-using same memory blocks to hold temporary
dsts.

Note : remove unneeded smp_mb__before_atomic_dec(); in dst_release(),
since atomic_dec_return() implies a full memory barrier.

Stress test, 160.000.000 udp frames sent, IP route cache disabled
(DDOS).

Before:

real    0m38.091s
user    0m13.189s
sys     7m53.018s

After:

real	0m29.946s
user	0m12.157s
sys	7m40.605s

For reference, if IP route cache was enabled :

real	0m32.030s
user	0m10.521s
sys	8m15.243s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Oct, 2010 2 commits

inet: RCU changes in inetdev_by_index() · 8723e1b4

Authored by Eric Dumazet on Oct 19, 2010

Convert inetdev_by_index() to not increment in_dev refcount.

Callers hold RCU or RTNL, and should not decrement in_dev refcount.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: avoid a dev refcount in ip_mc_find_dev() · 9e917dca

Authored by Eric Dumazet on Oct 19, 2010

We hold RTNL in ip_mc_find_dev(), no need to touch device refcount.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Oct, 2010 8 commits

IPv4: route.c: Change checks against 0xffffffff to ipv4_is_lbcast() · 27a954bd

Authored by Andy Walls on Oct 17, 2010

Change a few checks against the hardcoded broadcast address,
0xffffffff, to ipv4_is_lbcast().  Remove some existing checks
using ipv4_is_lbcast() that are now obviously superfluous.
Signed-off-by: Andy Walls <awalls@md.metrocast.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
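A userspace sketch of what such a helper checks, assuming the address is held in network byte order. `ipv4_is_lbcast_sketch()` is a hypothetical userspace equivalent written for illustration, using the POSIX `htonl()` and `INADDR_BROADCAST` definitions rather than the kernel's `__be32` type.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* True when addr (network byte order) is the limited broadcast
 * address 255.255.255.255. */
static inline bool ipv4_is_lbcast_sketch(uint32_t addr)
{
	return addr == htonl(INADDR_BROADCAST);
}
```

Compared with a bare `== 0xffffffff`, the named helper documents which special address is being tested.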

netfilter: fix kconfig unmet dependency warning · 76b6717b

Authored by Randy Dunlap on Oct 18, 2010

Fix netfilter kconfig unmet dependencies warning & spell out
"compatible" while there.

warning: (IP_NF_TARGET_TTL && NET && INET && NETFILTER && IP_NF_IPTABLES && NETFILTER_ADVANCED || IP6_NF_TARGET_HL && NET && INET && IPV6 && NETFILTER && IP6_NF_IPTABLES && NETFILTER_ADVANCED) selects NETFILTER_XT_TARGET_HL which has unmet direct dependencies ((IP_NF_MANGLE || IP6_NF_MANGLE) && NETFILTER_ADVANCED)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

Update broken web addresses in the kernel. · 631dd1a8

Authored by Justin P. Mattock on Oct 18, 2010

The patch below updates broken web addresses in the kernel
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Ben Pfaff <blp@cs.stanford.edu>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

fib_hash: RCU conversion phase 2 · 19f57256

Authored by Eric Dumazet on Oct 14, 2010

Get rid of fib_hash_lock rwlock.

The fn_zone hash table resize is the noticeable part of this patch.

I added a seqlock per fn_zone, so that readers can restart their lookup
in the (very rare) case a writer expanded the hash table.

Add rcu heads in fib_alias and fib_node, use call_rcu() to defer their
freeing, and use appropriate _rcu list manipulations.

Stress test (160.000.000 udp frames sent, IP route cache disabled to
mimic DDOS attack, FIB_HASH)

Before:
real	0m41.191s
user	0m13.137s
sys	8m55.241s

After:
real	0m38.091s
user	0m13.189s
sys	7m53.018s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
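The reader-restart behaviour of a per-zone sequence lock can be sketched in userspace with C11 atomics. This is not the kernel's seqlock API: `zone_sketch`, `zone_resize_sketch()` and `zone_read_size_sketch()` are hypothetical names, a single `hash_size` word stands in for the resizable hash table, and the sketch assumes writers are already serialized by some outer lock.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a zone whose hash table can be resized. */
struct zone_sketch {
	atomic_uint seq;	/* even: stable, odd: resize in progress */
	atomic_uint hash_size;	/* stands in for the hash table itself */
};

/* Writer side; assumes callers are serialized against each other. */
static void zone_resize_sketch(struct zone_sketch *z, unsigned int new_size)
{
	unsigned int s = atomic_load_explicit(&z->seq, memory_order_relaxed);

	atomic_store_explicit(&z->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&z->hash_size, new_size, memory_order_relaxed);
	atomic_store_explicit(&z->seq, s + 2, memory_order_release);
}

/* Reader side; retries if a resize overlapped the read. */
static unsigned int zone_read_size_sketch(struct zone_sketch *z)
{
	unsigned int s, size;

	do {
		s = atomic_load_explicit(&z->seq, memory_order_acquire);
		size = atomic_load_explicit(&z->hash_size, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((s & 1) ||
		 s != atomic_load_explicit(&z->seq, memory_order_relaxed));

	return size;
}
```

Readers never block the writer; they pay only for a retry in the rare case that the sequence number changed while they were reading.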

fib_hash: RCU conversion phase 1 · 117a8cde

Authored by Eric Dumazet on Oct 14, 2010

First step for RCU conversion of fib_hash :

struct fn_zone are created and never deleted.

Very classic conversion, using rcu_assign_pointer(), rcu_dereference()
and rtnl_dereference() verbs.

__rcu markers on fz_next and fn_zone_list

They are created under RTNL, so we don't need fib_hash_lock anymore in
fn_new_zone().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib_hash: embed initial hash table in fn_zone · 9bef83ed

Authored by Eric Dumazet on Oct 14, 2010

While looking for false sharing problems, I noticed
sizeof(struct fn_zone) was small (28 bytes) and possibly sharing a cache
line with an often written kernel structure.

Most of the time, fn_zone uses its initial hash table of 16 slots.

We can avoid the false sharing problem by embedding this initial hash
table in fn_zone itself, so that sizeof(fn_zone) > L1_CACHE_BYTES

We did a similar optimization in commit a6501e08 (Reduce memory needs
and speedup lookups)

Add a fz_revorder field to speedup fn_hash() a bit.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: use correct counters in CA_CWR state too · c60ce4e2

Authored by Ilpo Järvinen on Oct 14, 2010

As CWR is stronger than CA_Disorder state, we can miscount
SACK/Reno failure into other timeouts. Not a bad problem as
it can happen only due to ECN, FRTO detecting spurious RTO
or xmit error which are the only callers of tcp_enter_cwr.
And even then losses and RTO must still follow thereafter
to actually end up into the relevant code paths.

Compile tested.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: sack lost marking fixes · 1fdb9361

Authored by Ilpo Järvinen on Oct 14, 2010

When only fast rexmit should be done, tcp_mark_head_lost marks
L too far. Also, sacked_upto below 1 is perfectly valid number,
the packets == 0 then needs to be trapped elsewhere.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Oct, 2010 2 commits

fib: avoid false sharing on fib_table_hash · 10da66f7

Authored by Eric Dumazet on Oct 13, 2010

While doing profile analysis, I found fib_hash_table was sometimes in a
cache line shared by a possibly often written kernel structure.

(CONFIG_IP_ROUTE_MULTIPATH || !CONFIG_IPV6_MULTIPLE_TABLES)

It's hard to detect because it is not easily reproducible.

Make sure we allocate a full cache line to keep this shared in all cpus
caches.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib_trie: use fls() instead of open coded loop · 874ffa8f

Authored by Eric Dumazet on Oct 13, 2010

fib_table_lookup() might use fls() to speedup an open coded loop.

Noticed while doing a profile analysis.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
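As a generic illustration of the kind of replacement meant here (not the actual loop from fib_table_lookup(), which is not shown in this log), the sketch below compares an open-coded "find last set bit" loop with a version built on a compiler builtin. `fls_loop_sketch()` and `fls_sketch()` are hypothetical userspace names, and `__builtin_clz()` is specific to GCC and Clang.

```c
#include <limits.h>

/* Open-coded variant: shift right until no bits remain, counting steps. */
static int fls_loop_sketch(unsigned int x)
{
	int pos = 0;

	while (x) {
		x >>= 1;
		pos++;
	}
	return pos;
}

/* Same result from a count-leading-zeros builtin; returns 0 for x == 0
 * and the 1-based position of the highest set bit otherwise. */
static int fls_sketch(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}
```

On most architectures the builtin maps to a single instruction, whereas the loop runs once per bit position up to the highest set bit.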