提交 · 0f6efff92524c65fc3ef41c8b936c526580f1db0 · OpenHarmony / kernel_linux

16 6月, 2012 1 次提交

由 Eric Dumazet 提交于 6月 14, 2012

Orphaning skb in dev_hard_start_xmit() makes bonding behavior
unfriendly for applications sending big UDP bursts : Once packets
pass the bonding device and come to real device, they might hit a full
qdisc and be dropped. Without orphaning, the sender is automatically
throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming
sk_sndbuf is not too big)

We could try to defer the orphaning adding another test in
dev_hard_start_xmit(), but all this seems of little gain,
now that BQL tends to make packets more likely to be parked
in Qdisc queues instead of NIC TX ring, in cases where performance
matters.

Reverts commits :
fc6055a5 net: Introduce skb_orphan_try()
87fd308c net: skb_tx_hash() fix relative to skb_orphan_try()
and removes SKBTX_DRV_NEEDS_SK_REF flag
Reported-and-bisected-by: NJean-Michel Hautbois <jhautbois@gmail.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Tested-by: NOliver Hartkopp <socketcan@hartkopp.net>
Acked-by: NOliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62b1a8ab

14 6月, 2012 1 次提交

netpoll: fix netpoll_send_udp() bugs · 954fba02

由 Eric Dumazet 提交于 6月 12, 2012

Bogdan Hamciuc diagnosed and fixed following bug in netpoll_send_udp() :

"skb->len += len;" instead of "skb_put(skb, len);"

Meaning that _if_ a network driver needs to call skb_realloc_headroom(),
only packet headers would be copied, leaving garbage in the payload.

However the skb_realloc_headroom() must be avoided as much as possible
since it requires memory and netpoll tries hard to work even if memory
is exhausted (using a pool of preallocated skbs)

It appears netpoll_send_udp() reserved 16 bytes for the ethernet header,
which happens to work for typicall drivers but not all.

Right thing is to use LL_RESERVED_SPACE(dev)
(And also add dev->needed_tailroom of tailroom)

This patch combines both fixes.

Many thanks to Bogdan for raising this issue.
Reported-by: NBogdan Hamciuc <bogdan.hamciuc@freescale.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Tested-by: NBogdan Hamciuc <bogdan.hamciuc@freescale.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: NNeil Horman <nhorman@tuxdriver.com>
Reviewed-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

954fba02

09 6月, 2012 4 次提交

net/core: fix kernel-doc warnings · c6c4b97c

由 Randy Dunlap 提交于 6月 08, 2012

Fix kernel-doc warnings in net/core:

Warning(net/core/skbuff.c:3368): No description found for parameter 'delta_truesize'
Warning(net/core/filter.c:628): No description found for parameter 'pfp'
Warning(net/core/filter.c:628): Excess function parameter 'sk' description in 'sk_unattached_filter_create'
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c6c4b97c

l2tp: fix a race in l2tp_ip_sendmsg() · 4399a4df

由 Eric Dumazet 提交于 6月 08, 2012

Commit 081b1b1b (l2tp: fix l2tp_ip_sendmsg() route handling) added
a race, in case IP route cache is disabled.

In this case, we should not do the dst_release(&rt->dst), since it'll
free the dst immediately, instead of waiting a RCU grace period.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Cc: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4399a4df

mac80211: add back channel change flag · 6aee4ca3

由 Stanislaw Gruszka 提交于 6月 07, 2012

commit 24398e39
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Wed Mar 28 10:58:36 2012 +0200

    mac80211: set HT channel before association

removed IEEE80211_CONF_CHANGE_CHANNEL argument from ieee80211_hw_config,
which is required by iwl4965 driver, otherwise that driver does not
configure channel properly and is not able to associate.
Signed-off-by: NStanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

6aee4ca3

NFC: Fix possible NULL ptr deref when getting the name of a socket · 58d1eab7

由 Sasha Levin 提交于 6月 06, 2012

llcp_sock_getname() might get called before the LLCP socket was created.
This condition isn't checked, and llcp_sock_getname will simply deref a
NULL ptr in that case.

This exists starting with d646960f ("NFC: Initial LLCP support").
Signed-off-by: NSasha Levin <levinsasha928@gmail.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

58d1eab7

08 6月, 2012 5 次提交

snmp: fix OutOctets counter to include forwarded datagrams · 2d8dbb04

由 Vincent Bernat 提交于 6月 05, 2012

RFC 4293 defines ipIfStatsOutOctets (similar definition for
ipSystemStatsOutOctets):

   The total number of octets in IP datagrams delivered to the lower
   layers for transmission.  Octets from datagrams counted in
   ipIfStatsOutTransmits MUST be counted here.

And ipIfStatsOutTransmits:

   The total number of IP datagrams that this entity supplied to the
   lower layers for transmission.  This includes datagrams generated
   locally and those forwarded by this entity.

Therefore, IPSTATS_MIB_OUTOCTETS must be incremented when incrementing
IPSTATS_MIB_OUTFORWDATAGRAMS.

IP_UPD_PO_STATS is not used since ipIfStatsOutRequests must not
include forwarded datagrams:

   The total number of IP datagrams that local IP user-protocols
   (including ICMP) supplied to IP in requests for transmission.  Note
   that this counter does not include any datagrams counted in
   ipIfStatsOutForwDatagrams.
Signed-off-by: NVincent Bernat <bernat@luffy.cx>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d8dbb04

appletalk: Remove out of date message in printk · 278f015e

由 Dave Jones 提交于 6月 06, 2012

I accidentally triggered this printk, which amused me for a few moments.
Given we're post 2.2, we could just -EACCES, but does anyone even care about Appletalk now ?
I figure it's better to leave sleeping dogs lie, and just update the message.
Signed-off-by: NDave Jones <davej@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

278f015e

ipv6: fib: Restore NTF_ROUTER exception in fib6_age() · 8bd74516

由 Thomas Graf 提交于 6月 07, 2012

Commit 5339ab8b (ipv6: fib: Convert fib6_age() to
dst_neigh_lookup().) seems to have mistakenly inverted the
exception for cached NTF_ROUTER routes.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8bd74516

net: neighbour: fix neigh_dump_info() · 4bd6683b

由 Eric Dumazet 提交于 6月 07, 2012

Denys found out "ip neigh" output was truncated to
about 54 neighbours.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDenys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4bd6683b

net: l2tp_eth: fix kernel panic on rmmod l2tp_eth · a06998b8

由 Eric Dumazet 提交于 6月 07, 2012

We must prevent module unloading if some devices are still attached to
l2tp_eth driver.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDenys Fedoryshchenko <denys@visp.net.lb>
Tested-by: NDenys Fedoryshchenko <denys@visp.net.lb>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a06998b8

07 6月, 2012 3 次提交

netfilter: nf_ct_h323: fix bug in rtcp natting · d109e9af

由 Pablo Neira Ayuso 提交于 6月 04, 2012

The nat_rtp_rtcp hook takes two separate parameters port and rtp_port.

port is expected to be the real h245 address (found inside the packet).
rtp_port is the even number closest to port (RTP ports are even and
RTCP ports are odd).

However currently, both port and rtp_port are having same value (both are
rounded to nearest even numbers).

This works well in case of openlogicalchannel with media (RTP/even) port.

But in case of openlogicalchannel for media control (RTCP/odd) port,
h245 address in the packet is wrongly modified to have an even port.

I am attaching a pcap demonstrating the problem, for any further analysis.

This behavior was introduced around v2.6.19 while rewriting the helper.
Signed-off-by: NJagdish Motwani <jagdish.motwani@elitecore.com>
Signed-off-by: NSanket Shah <sanket.shah@elitecore.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

d109e9af

netfilter: xt_HMARK: fix endianness and provide consistent hashing · d1992b16

由 Hans Schillstrom 提交于 5月 17, 2012

This patch addresses two issues:

a) Fix usage of u32 and __be32 that causes endianess warnings via sparse.
b) Ensure consistent hashing in a cluster that is composed of big and
   little endian systems. Thus, we obtain the same hash mark in an
   heterogeneous cluster.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NHans Schillstrom <hans@schillstrom.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

d1992b16

inetpeer: fix a race in inetpeer_gc_worker() · 55432d2b

由 Eric Dumazet 提交于 6月 05, 2012

commit 5faa5df1 (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race :

Before freeing an inetpeer, we must respect a RCU grace period, and make
sure no user will attempt to increase refcnt.

inetpeer_invalidate_tree() waits for a RCU grace period before inserting
inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find a inetpeer in this tree.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

55432d2b

06 6月, 2012 1 次提交

cfg80211: fix interface combinations check · 463454b5

由 Johannes Berg 提交于 6月 05, 2012

If a given interface combination doesn't contain
a required interface type then we missed checking
that and erroneously allowed it even though iface
type wasn't there at all. Add a check that makes
sure that all interface types are accounted for.

Cc: stable@kernel.org
Reported-by: NMohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

463454b5

05 6月, 2012 9 次提交

Bluetooth: Fix checking the wrong flag when accepting a socket · ddcd0f41

由 Vinicius Costa Gomes 提交于 5月 31, 2012

Most probably a typo, the check should have been for BT_SK_DEFER_SETUP
instead of BT_DEFER_SETUP (which right now only represents a socket
option).
Signed-off-by: NVinicius Costa Gomes <vinicius.gomes@openbossa.org>
Acked-by: NAndrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: NGustavo Padovan <gustavo.padovan@collabora.co.uk>

ddcd0f41

mac80211: fix non RCU-safe sta_list manipulation · 794454ce

由 Arik Nemtsov 提交于 6月 03, 2012

sta_info_cleanup locks the sta_list using rcu_read_lock however
the delete operation isn't rcu safe. A race between sta_info_cleanup
timer being called and a STA being removed can occur which leads
to a panic while traversing sta_list. Fix this by switching to the
RCU-safe versions.

Cc: stable@vger.kernel.org
Reported-by: NEyal Shapira <eyal@wizery.com>
Signed-off-by: NArik Nemtsov <arik@wizery.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

794454ce

mac80211: Fix likely misuse of | for & · 5204267d

由 Joe Perches 提交于 5月 30, 2012

Using | with a constant is always true.
Likely this should have be &.

cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

5204267d

mac80211: add missing rcu_read_lock/unlock in agg-rx session timer · d8c7aae6

由 Felix Fietkau 提交于 5月 30, 2012

Fixes a lockdep warning:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
net/mac80211/agg-rx.c:148 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 1
1 lock held by arecord/11226:
 #0:  (&tid_agg_rx->session_timer){+.-...}, at: [<ffffffff81066bb0>] call_timer_fn+0x0/0x360

stack backtrace:
Pid: 11226, comm: arecord Not tainted 3.1.0-kml #16
Call Trace:
 <IRQ>  [<ffffffff81093454>] lockdep_rcu_dereference+0xa4/0xc0
 [<ffffffffa02778c9>] sta_rx_agg_session_timer_expired+0xc9/0x110 [mac80211]
 [<ffffffffa0277800>] ? ieee80211_process_addba_resp+0x220/0x220 [mac80211]
 [<ffffffff81066c3a>] call_timer_fn+0x8a/0x360
 [<ffffffff81066bb0>] ? init_timer_deferrable_key+0x30/0x30
 [<ffffffff81477bb0>] ? _raw_spin_unlock_irq+0x30/0x70
 [<ffffffff81067049>] run_timer_softirq+0x139/0x310
 [<ffffffff81091d5e>] ? put_lock_stats.isra.25+0xe/0x40
 [<ffffffff810922ac>] ? lock_release_holdtime.part.26+0xdc/0x160
 [<ffffffffa0277800>] ? ieee80211_process_addba_resp+0x220/0x220 [mac80211]
 [<ffffffff8105cb78>] __do_softirq+0xc8/0x3c0
 [<ffffffff8108f088>] ? tick_dev_program_event+0x48/0x110
 [<ffffffff8108f16f>] ? tick_program_event+0x1f/0x30
 [<ffffffff81153b15>] ? putname+0x35/0x50
 [<ffffffff8147a43c>] call_softirq+0x1c/0x30
 [<ffffffff81004c55>] do_softirq+0xa5/0xe0
 [<ffffffff8105d1ee>] irq_exit+0xae/0xe0
 [<ffffffff8147ac6b>] smp_apic_timer_interrupt+0x6b/0x98
 [<ffffffff81479ab3>] apic_timer_interrupt+0x73/0x80
 <EOI>  [<ffffffff8146aac6>] ? free_debug_processing+0x1a1/0x1d5
 [<ffffffff81153b15>] ? putname+0x35/0x50
 [<ffffffff8146ab2b>] __slab_free+0x31/0x2ca
 [<ffffffff81477c3a>] ? _raw_spin_unlock_irqrestore+0x4a/0x90
 [<ffffffff81253b8f>] ? __debug_check_no_obj_freed+0x15f/0x210
 [<ffffffff81097054>] ? lock_release_nested+0x84/0xc0
 [<ffffffff8113ec55>] ? kmem_cache_free+0x105/0x250
 [<ffffffff81153b15>] ? putname+0x35/0x50
 [<ffffffff81153b15>] ? putname+0x35/0x50
 [<ffffffff8113ed8f>] kmem_cache_free+0x23f/0x250
 [<ffffffff81153b15>] putname+0x35/0x50
 [<ffffffff81146d8d>] do_sys_open+0x16d/0x1d0
 [<ffffffff81146e10>] sys_open+0x20/0x30
 [<ffffffff81478f42>] system_call_fastpath+0x16/0x1b
Reported-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NFelix Fietkau <nbd@openwrt.org>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

d8c7aae6

mac80211: clean up remain-on-channel on interface stop · 71ecfa18

由 Johannes Berg 提交于 5月 31, 2012

When any interface goes down, it could be the one that we
were doing a remain-on-channel with. We therefore need to
cancel the remain-on-channel and flush the related work
structs so they don't run after the interface has been
removed or even destroyed.

It's also possible in this case that an off-channel SKB
was never transmitted, so free it if this is the case.
Note that this can also happen if the driver finishes
the off-channel period without ever starting it.

Cc: stable@kernel.org
Reported-by: NNirav Shah <nirav.j2.shah@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

71ecfa18

mac80211: fix error in station state transitions during reconfig · bd34ab62

由 Meenakshi Venkataraman 提交于 5月 30, 2012

As part of hardware reconfig mac80211 tries
to restore the station state to its values
before the hardware reconfig, but it only
goes to the last-state - 1. Fix this
off-by-one error.

Cc: stable@kernel.org [3.4]
Signed-off-by: NMeenakshi Venkataraman <meenakshi.venkataraman@intel.com>
Reviewed-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

bd34ab62

mac80211: Fix Unreachable Mesh Station Problem when joining to another MBSS · b8bacc18

由 Chun-Yeow Yeoh 提交于 5月 30, 2012

Mesh station that joins an MBSS is reachable using mesh portal with 6
address frame by mesh stations from another MBSS if these two different
MBSSes are bridged. However, if the mesh station later moves into the
same MBSS of those mesh stations, it is unreachable by mesh stations
in the MBSS due to the mpp_paths table is not deleted. A quick fix
is to perform mesh_path_lookup, if it is available for the target
destination, mpp_path_lookup is not performed. When the mesh station
moves back to its original MBSS, the mesh_paths will be deleted once
expired. So, it will be reachable using mpp_path_lookup again.
Signed-off-by: NChun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

b8bacc18

cfg80211: use sme_state in ibss start/join path · 28f33366

由 Amitkumar Karwar 提交于 5月 29, 2012

CFG80211_DEV_WARN_ON() at "net/wireless/ibss.c line 63"
is unnecessarily triggered even after successful connection,
when cfg80211_ibss_joined() is called by driver inside
.join_ibss handler.

This patch fixes the problem by changing 'sme_state' in ibss path
and having WARN_ON() check for 'sme_state' similar to infra
association.
Signed-off-by: NAmitkumar Karwar <akarwar@marvell.com>
Signed-off-by: NBing Zhao <bzhao@marvell.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

28f33366

mac80211: run scan after finish connection monitoring · 925e64c3

由 Stanislaw Gruszka 提交于 5月 16, 2012

commit 133d40f9
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Wed Mar 28 16:01:19 2012 +0200

    mac80211: do not scan and monitor connection in parallel

add bug, which make possible to start a scan and never finish it, so
make every new scanning request finish with -EBUSY error. This can
happen on code paths where we finish connection monitoring and clear
IEEE80211_STA_*_POLL flags, but do not check if scan was deferred.
This patch fixes those code paths.
Signed-off-by: NStanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

925e64c3

04 6月, 2012 1 次提交

drop_monitor: dont sleep in atomic context · bec4596b

由 Eric Dumazet 提交于 6月 04, 2012

drop_monitor calls several sleeping functions while in atomic context.

 BUG: sleeping function called from invalid context at mm/slub.c:943
 in_atomic(): 1, irqs_disabled(): 0, pid: 2103, name: kworker/0:2
 Pid: 2103, comm: kworker/0:2 Not tainted 3.5.0-rc1+ #55
 Call Trace:
  [<ffffffff810697ca>] __might_sleep+0xca/0xf0
  [<ffffffff811345a3>] kmem_cache_alloc_node+0x1b3/0x1c0
  [<ffffffff8105578c>] ? queue_delayed_work_on+0x11c/0x130
  [<ffffffff815343fb>] __alloc_skb+0x4b/0x230
  [<ffffffffa00b0360>] ? reset_per_cpu_data+0x160/0x160 [drop_monitor]
  [<ffffffffa00b022f>] reset_per_cpu_data+0x2f/0x160 [drop_monitor]
  [<ffffffffa00b03ab>] send_dm_alert+0x4b/0xb0 [drop_monitor]
  [<ffffffff810568e0>] process_one_work+0x130/0x4c0
  [<ffffffff81058249>] worker_thread+0x159/0x360
  [<ffffffff810580f0>] ? manage_workers.isra.27+0x240/0x240
  [<ffffffff8105d403>] kthread+0x93/0xa0
  [<ffffffff816be6d4>] kernel_thread_helper+0x4/0x10
  [<ffffffff8105d370>] ? kthread_freezable_should_stop+0x80/0x80
  [<ffffffff816be6d0>] ? gs_change+0xb/0xb

Rework the logic to call the sleeping functions in right context.

Use standard timer/workqueue api to let system chose any cpu to perform
the allocation and netlink send.

Also avoid a loop if reset_per_cpu_data() cannot allocate memory :
use mod_timer() to wait 1/10 second before next try.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bec4596b

03 6月, 2012 1 次提交

tty: Revert the tty locking series, it needs more work · f309532b

由 Linus Torvalds 提交于 6月 02, 2012

This reverts the tty layer change to use per-tty locking, because it's
not correct yet, and fixing it will require some more deep surgery.

The main revert is d29f3ef3 ("tty_lock: Localise the lock"), but
there are several smaller commits that built upon it, they also get
reverted here. The list of reverted commits is:

  fde86d31 - tty: add lockdep annotations
  8f6576ad - tty: fix ldisc lock inversion trace
  d3ca8b64 - pty: Fix lock inversion
  b1d679af - tty: drop the pty lock during hangup
  abcefe5f - tty/amiserial: Add missing argument for tty_unlock()
  fd11b42e - cris: fix missing tty arg in wait_event_interruptible_tty call
  d29f3ef3 - tty_lock: Localise the lock

The revert had a trivial conflict in the 68360serial.c staging driver
that got removed in the meantime.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f309532b

02 6月, 2012 2 次提交

tcp: reflect SYN queue_mapping into SYNACK packets · fff32699

由 Eric Dumazet 提交于 6月 01, 2012

While testing how linux behaves on SYNFLOOD attack on multiqueue device
(ixgbe), I found that SYNACK messages were dropped at Qdisc level
because we send them all on a single queue.

Obvious choice is to reflect incoming SYN packet @queue_mapping to
SYNACK packet.

Under stress, my machine could only send 25.000 SYNACK per second (for
200.000 incoming SYN per second). NIC : ixgbe with 16 rx/tx queues.

After patch, not a single SYNACK is dropped.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fff32699

tcp: do not create inetpeer on SYNACK message · 7433819a

由 Eric Dumazet 提交于 5月 31, 2012

Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting
larger and larger, using lots of memory and cpu time.

tcp_v4_send_synack()
->inet_csk_route_req()
 ->ip_route_output_flow()
  ->rt_set_nexthop()
   ->rt_init_metrics()
    ->inet_getpeer( create = true)

This is a side effect of commit a4daad6b (net: Pre-COW metrics for
TCP) added in 2.6.39

Possible solution :

Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS

Before patch :

# grep peer /proc/slabinfo
inet_peer_cache   4175430 4175430    192   42    2 : tunables    0    0    0 : slabdata  99415  99415      0

Samples: 41K of event 'cycles', Event count (approx.): 30716565122
+  20,24%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_getpeer
+   8,19%      ksoftirqd/0  [kernel.kallsyms]           [k] peer_avl_rebalance.isra.1
+   4,81%      ksoftirqd/0  [kernel.kallsyms]           [k] sha_transform
+   3,64%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_table_lookup
+   2,36%      ksoftirqd/0  [ixgbe]                     [k] ixgbe_poll
+   2,16%      ksoftirqd/0  [kernel.kallsyms]           [k] __ip_route_output_key
+   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] kernel_map_pages
+   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] ip_route_input_common
+   2,01%      ksoftirqd/0  [kernel.kallsyms]           [k] __inet_lookup_established
+   1,83%      ksoftirqd/0  [kernel.kallsyms]           [k] md5_transform
+   1,75%      ksoftirqd/0  [kernel.kallsyms]           [k] check_leaf.isra.9
+   1,49%      ksoftirqd/0  [kernel.kallsyms]           [k] ipt_do_table
+   1,46%      ksoftirqd/0  [kernel.kallsyms]           [k] hrtimer_interrupt
+   1,45%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_alloc
+   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_csk_search_req
+   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] __netif_receive_skb
+   1,16%      ksoftirqd/0  [kernel.kallsyms]           [k] copy_user_generic_string
+   1,15%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_free
+   1,02%      ksoftirqd/0  [kernel.kallsyms]           [k] tcp_make_synack
+   0,93%      ksoftirqd/0  [kernel.kallsyms]           [k] _raw_spin_lock_bh
+   0,87%      ksoftirqd/0  [kernel.kallsyms]           [k] __call_rcu
+   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] rt_garbage_collect
+   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_rules_lookup
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7433819a

01 6月, 2012 9 次提交

sch_atm.c: get rid of poinless extern · d5836751

由 Al Viro 提交于 4月 19, 2012

sockfd_lookup() is declared in linux/net.h, which is pulled by
linux/skbuff.h (and needed for a lot of other stuff in sch_atm.c
anyway).
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d5836751

nfsd4: move rq_flavor into svc_cred · d5497fc6

由 J. Bruce Fields 提交于 5月 14, 2012

Move the rq_flavor into struct svc_cred, and use it in setclientid and
exchange_id comparisons as well.
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

d5497fc6

nfsd4: move principal name into svc_cred · 03a4e1f6

由 J. Bruce Fields 提交于 5月 14, 2012

Instead of keeping the principal name associated with a request in a
structure that's private to auth_gss and using an accessor function,
move it to svc_cred.
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

03a4e1f6

J
svcrpc: fix a comment typo · 3ddbe879
由 J. Bruce Fields 提交于 5月 16, 2012
```
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
```
3ddbe879

sunrpc: do array overrun check in svc_recv before allocating pages · 91c427ac

由 Jeff Layton 提交于 5月 04, 2012

There's little point in waiting until after we allocate all of the pages
to see if we're going to overrun the array. In the event that this
calculation is really off we could end up scribbling over a bunch of
memory and make it tougher to debug.
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

91c427ac

SUNRPC: move per-net operations from svc_destroy() · 786185b5

由 Stanislav Kinsbursky 提交于 5月 04, 2012

The idea is to separate service destruction and per-net operations,
because these are two different things and the mix looks ugly.

Notes:

1) For NFS server this patch looks ugly (sorry for that). But these
place will be rewritten soon during NFSd containerization.

2) LockD per-net counter increase int lockd_up() was moved prior to
make_socks() to make lockd_down_net() call safe in case of error.
Signed-off-by: NStanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

786185b5

SUNRPC: new svc_bind() routine introduced · 9793f7c8

由 Stanislav Kinsbursky 提交于 5月 02, 2012

This new routine is responsible for service registration in a specified
network context.

The idea is to separate service creation from per-net operations.

Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.
Signed-off-by: NStanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

9793f7c8

rpc: handle rotated gss data for Windows interoperability · c52226da

由 J. Bruce Fields 提交于 4月 11, 2012

The data in Kerberos gss tokens can be rotated.  But we were lazy and
rejected any nonzero rotation value.  It wasn't necessary for the
implementations we were testing against at the time.

But it appears that Windows does use a nonzero value here.

So, implement rotation to bring ourselves into compliance with the spec
and to interoperate with Windows.
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

c52226da

net: sock: validate data_len before allocating skb in sock_alloc_send_pskb() · cc9b17ad

由 Jason Wang 提交于 5月 30, 2012

We need to validate the number of pages consumed by data_len, otherwise frags
array could be overflowed by userspace. So this patch validate data_len and
return -EMSGSIZE when data_len may occupies more frags than MAX_SKB_FRAGS.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cc9b17ad

30 5月, 2012 3 次提交

drop_monitor: Add module alias to enable automatic module loading · 3fdcbd45

由 Neil Horman 提交于 5月 29, 2012

Now that we have module alias macros for generic netlink families, lets use
those to mark modules with the appropriate family names for loading
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3fdcbd45

genetlink: Build a generic netlink family module alias · e9412c37

由 Neil Horman 提交于 5月 29, 2012

Generic netlink searches for -type- formatted aliases when requesting a module to
fulfill a protocol request (i.e. net-pf-16-proto-16-type-<x>, where x is a type
value). However generic netlink protocols have no well defined type numbers,
they have string names. Modify genl_ctrl_getfamily to request an alias in the
format net-pf-16-proto-16-family-<x> instead, where x is a generic string, and
add a macro that builds on the previously added MODULE_ALIAS_NET_PF_PROTO_NAME
macro to allow modules to specifify those generic strings.

Note, l2tp previously hacked together an net-pf-16-proto-16-type-l2tp alias
using the MODULE_ALIAS macro, with these updates we can convert that to use the
PROTO_NAME macro.
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: James Chapman <jchapman@katalix.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e9412c37

memcg: decrement static keys at real destroy time · 3f134619

由 Glauber Costa 提交于 5月 29, 2012

We call the destroy function when a cgroup starts to be removed, such as
by a rmdir event.

However, because of our reference counters, some objects are still
inflight.  Right now, we are decrementing the static_keys at destroy()
time, meaning that if we get rid of the last static_key reference, some
objects will still have charges, but the code to properly uncharge them
won't be run.

This becomes a problem specially if it is ever enabled again, because now
new charges will be added to the staled charges making keeping it pretty
much impossible.

We just need to be careful with the static branch activation: since there
is no particular preferred order of their activation, we need to make sure
that we only start using it after all call sites are active.  This is
achieved by having a per-memcg flag that is only updated after
static_key_slow_inc() returns.  At this time, we are sure all sites are
active.

This is made per-memcg, not global, for a reason: it also has the effect
of making socket accounting more consistent.  The first memcg to be
limited will trigger static_key() activation, therefore, accounting.  But
all the others will then be accounted no matter what.  After this patch,
only limited memcgs will have its sockets accounted.

[akpm@linux-foundation.org: move enum sock_flag_bits into sock.h,
                            document enum sock_flag_bits,
                            convert memcg_proto_active() and memcg_proto_activated() to test_bit(),
                            redo tcp_update_limit() comment to 80 cols]
Signed-off-by: NGlauber Costa <glommer@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Acked-by: NDavid Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3f134619

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多