提交 · 25da0e3e9d3fb2b522bc2a598076735850310eb1 · openanolis / cloud-kernel

05 4月, 2013 2 次提交

Revert "af_unix: dont send SCM_CREDENTIAL when dest socket is NULL" · 25da0e3e

由 Eric W. Biederman 提交于 4月 03, 2013

This reverts commit 14134f65.

The problem that the above patch was meant to address is that af_unix
messages are not being coallesced because we are sending unnecesarry
credentials.  Not sending credentials in maybe_add_creds totally
breaks unconnected unix domain sockets that wish to send credentails
to other sockets.

In practice this break some versions of udev because they receive a
message and the sending uid is bogus so they drop the message.
Reported-by: NSven Joachim <svenjoac@gmx.de>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

25da0e3e

net: count hw_addr syncs so that unsync works properly. · 4543fbef

由 Vlad Yasevich 提交于 4月 02, 2013

A few drivers use dev_uc_sync/unsync to synchronize the
address lists from master down to slave/lower devices.  In
some cases (bond/team) a single address list is synched down
to multiple devices.  At the time of unsync, we have a leak
in these lower devices, because "synced" is treated as a
boolean and the address will not be unsynced for anything after
the first device/call.

Treat "synced" as a count (same as refcount) and allow all
unsync calls to work.
Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4543fbef

03 4月, 2013 4 次提交

netfilter: ip6t_NPT: Fix translation for non-multiple of 32 prefix lengths · 906b1c39

由 Matthias Schiffer 提交于 3月 30, 2013

The bitmask used for the prefix mangling was being calculated
incorrectly, leading to the wrong part of the address being replaced
when the prefix length wasn't a multiple of 32.
Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

906b1c39

VSOCK: Handle changes to the VMCI context ID. · 990454b5

由 Reilly Grant 提交于 4月 01, 2013

The VMCI context ID of a virtual machine may change at any time. There
is a VMCI event which signals this but datagrams may be processed before
this is handled. It is therefore necessary to be flexible about the
destination context ID of any datagrams received. (It can be assumed to
be correct because it is provided by the hypervisor.) The context ID on
existing sockets should be updated to reflect how the hypervisor is
currently referring to the system.
Signed-off-by: NReilly Grant <grantr@vmware.com>
Acked-by: NAndy King <acking@vmware.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

990454b5

net IPv6 : Fix broken IPv6 routing table after loopback down-up · 25fb6ca4

由 Balakumaran Kannan 提交于 4月 02, 2013

IPv6 Routing table becomes broken once we do ifdown, ifup of the loopback(lo)
interface. After down-up, routes of other interface's IPv6 addresses through
'lo' are lost.

IPv6 addresses assigned to all interfaces are routed through 'lo' for internal
communication. Once 'lo' is down, those routing entries are removed from routing
table. But those removed entries are not being re-created properly when 'lo' is
brought up. So IPv6 addresses of other interfaces becomes unreachable from the
same machine. Also this breaks communication with other machines because of
NDISC packet processing failure.

This patch fixes this issue by reading all interface's IPv6 addresses and adding
them to IPv6 routing table while bringing up 'lo'.

==Testing==
Before applying the patch:
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$

After applying the patch:
$ route -A inet6
Kernel IPv6 routing
table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$
Signed-off-by: NBalakumaran Kannan <Balakumaran.Kannan@ap.sony.com>
Signed-off-by: NMaruthi Thotad <Maruthi.Thotad@ap.sony.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

25fb6ca4

cbq: incorrect processing of high limits · f0f6ee1f

由 Vasily Averin 提交于 4月 01, 2013

currently cbq works incorrectly for limits > 10% real link bandwidth,
and practically does not work for limits > 50% real link bandwidth.
Below are results of experiments taken on 1 Gbit link

 In shaper | Actual Result
-----------+---------------
  100M     | 108 Mbps
  200M     | 244 Mbps
  300M     | 412 Mbps
  500M     | 893 Mbps

This happen because of q->now changes incorrectly in cbq_dequeue():
when it is called before real end of packet transmitting,
L2T is greater than real time delay, q_now gets an extra boost
but never compensate it.

To fix this problem we prevent change of q->now until its synchronization
with real time.
Signed-off-by: NVasily Averin <vvs@openvz.org>
Reviewed-by: NAlexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f0f6ee1f

30 3月, 2013 5 次提交

net: add a synchronize_net() in netdev_rx_handler_unregister() · 00cfec37

由 Eric Dumazet 提交于 3月 29, 2013

commit 35d48903 (bonding: fix rx_handler locking) added a race
in bonding driver, reported by Steven Rostedt who did a very good
diagnosis :

<quoting Steven>

I'm currently debugging a crash in an old 3.0-rt kernel that one of our
customers is seeing. The bug happens with a stress test that loads and
unloads the bonding module in a loop (I don't know all the details as
I'm not the one that is directly interacting with the customer). But the
bug looks to be something that may still be present and possibly present
in mainline too. It will just be much harder to trigger it in mainline.

In -rt, interrupts are threads, and can schedule in and out just like
any other thread. Note, mainline now supports interrupt threads so this
may be easily reproducible in mainline as well. I don't have the ability
to tell the customer to try mainline or other kernels, so my hands are
somewhat tied to what I can do.

But according to a core dump, I tracked down that the eth irq thread
crashed in bond_handle_frame() here:

        slave = bond_slave_get_rcu(skb->dev);
        bond = slave->bond; <--- BUG

the slave returned was NULL and accessing slave->bond caused a NULL
pointer dereference.

Looking at the code that unregisters the handler:

void netdev_rx_handler_unregister(struct net_device *dev)
{

        ASSERT_RTNL();
        RCU_INIT_POINTER(dev->rx_handler, NULL);
        RCU_INIT_POINTER(dev->rx_handler_data, NULL);
}

Which is basically:
        dev->rx_handler = NULL;
        dev->rx_handler_data = NULL;

And looking at __netif_receive_skb() we have:

        rx_handler = rcu_dereference(skb->dev->rx_handler);
        if (rx_handler) {
                if (pt_prev) {
                        ret = deliver_skb(skb, pt_prev, orig_dev);
                        pt_prev = NULL;
                }
                switch (rx_handler(&skb)) {

My question to all of you is, what stops this interrupt from happening
while the bonding module is unloading?  What happens if the interrupt
triggers and we have this:

        CPU0                    CPU1
        ----                    ----
  rx_handler = skb->dev->rx_handler

                        netdev_rx_handler_unregister() {
                           dev->rx_handler = NULL;
                           dev->rx_handler_data = NULL;

  rx_handler()
   bond_handle_frame() {
    slave = skb->dev->rx_handler;
    bond = slave->bond; <-- NULL pointer dereference!!!

What protection am I missing in the bond release handler that would
prevent the above from happening?

</quoting Steven>

We can fix bug this in two ways. First is adding a test in
bond_handle_frame() and others to check if rx_handler_data is NULL.

A second way is adding a synchronize_net() in
netdev_rx_handler_unregister() to make sure that a rcu protected reader
has the guarantee to see a non NULL rx_handler_data.

The second way is better as it avoids an extra test in fast path.
Reported-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

00cfec37

net: fq_codel: Fix off-by-one error · cd68ddd4

由 Vijay Subramanian 提交于 3月 28, 2013

Currently, we hold a max of sch->limit -1 number of packets instead of
sch->limit packets. Fix this off-by-one error.
Signed-off-by: NVijay Subramanian <subramanian.vijay@gmail.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cd68ddd4

net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb' · a561cf7e

由 Shmulik Ladkani 提交于 3月 27, 2013

'nf_reset' is called just prior calling 'netif_rx'.
No need to call it twice.
Reported-by: NIgor Michailov <rgohita@gmail.com>
Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a561cf7e

net: fix the use of this_cpu_ptr · 50eab050

由 Li RongQing 提交于 3月 27, 2013

flush_tasklet is not percpu var, and percpu is percpu var, and
	this_cpu_ptr(&info->cache->percpu->flush_tasklet)
is not equal to
	&this_cpu_ptr(info->cache->percpu)->flush_tasklet

1f743b07(use this_cpu_ptr per-cpu helper) introduced this bug.
Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

50eab050

ipv6: don't accept node local multicast traffic from the wire · 1c4a154e

由 Hannes Frederic Sowa 提交于 3月 26, 2013

Erik Hugne's errata proposal (Errata ID: 3480) to RFC4291 has been
verified: http://www.rfc-editor.org/errata_search.php?eid=3480

We have to check for pkt_type and loopback flag because either the
packets are allowed to travel over the loopback interface (in which case
pkt_type is PACKET_HOST and IFF_LOOPBACK flag is set) or they travel
over a non-loopback interface back to us (in which case PACKET_TYPE is
PACKET_LOOPBACK and IFF_LOOPBACK flag is not set).

Cc: Erik Hugne <erik.hugne@ericsson.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1c4a154e

28 3月, 2013 3 次提交

sch: add missing u64 in psched_ratecfg_precompute() · ea872d77

由 Sergey Popovich 提交于 3月 27, 2013

It seems that commit

commit 292f1c7f
Author: Jiri Pirko <jiri@resnulli.us>
Date:   Tue Feb 12 00:12:03 2013 +0000

    sch: make htb_rate_cfg and functions around that generic

adds little regression.

Before:

# tc qdisc add dev eth0 root handle 1: htb default ffff
# tc class add dev eth0 classid 1:ffff htb rate 5Gbit
# tc -s class show dev eth0
class htb 1:ffff root prio 0 rate 5000Mbit ceil 5000Mbit burst 625b cburst
625b
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 31 ctokens: 31

After:

# tc qdisc add dev eth0 root handle 1: htb default ffff
# tc class add dev eth0 classid 1:ffff htb rate 5Gbit
# tc -s class show dev eth0
class htb 1:ffff root prio 0 rate 1544Mbit ceil 1544Mbit burst 625b cburst
625b
 Sent 5073 bytes 41 pkt (dropped 0, overlimits 0 requeues 0)
 rate 1976bit 2pps backlog 0b 0p requeues 0
 lended: 41 borrowed: 0 giants: 0
 tokens: 1802 ctokens: 1802

This probably due to lost u64 cast of rate parameter in
psched_ratecfg_precompute() (net/sched/sch_generic.c).
Signed-off-by: NSergey Popovich <popovich_sergei@mail.ru>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ea872d77

rtnetlink: fix error return code in rtnl_link_fill() · fcca143d

由 Wei Yongjun 提交于 3月 27, 2013

Fix to return a negative error code from the error handling case
instead of 0(possible overwrite to 0 by ops->fill_xstats call),
as returned elsewhere in this function.
Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fcca143d

netfilter: nf_conntrack: fix error return code · 5389090b

由 Wei Yongjun 提交于 3月 27, 2013

Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in function
nf_conntrack_standalone_init().
Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

5389090b

27 3月, 2013 2 次提交

ipv4: Fix ip-header identification for gso packets. · 330305cc

由 Pravin B Shelar 提交于 3月 24, 2013

ip-header id needs to be incremented even if IP_DF flag is set.
This behaviour was changed in commit 490ab081
(IP_GRE: Fix IP-Identification).

Following patch fixes it so that identification is always
incremented.
Reported-by: NCong Wang <amwang@redhat.com>
Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

330305cc

af_unix: dont send SCM_CREDENTIAL when dest socket is NULL · 14134f65

由 dingtianhong 提交于 3月 25, 2013

SCM_SCREDENTIALS should apply to write() syscalls only either source or destination
socket asserted SOCK_PASSCRED. The original implememtation in maybe_add_creds is wrong,
and breaks several LSB testcases ( i.e. /tset/LSB.os/netowkr/recvfrom/T.recvfrom).
Origionally-authored-by: NKarel Srot <ksrot@redhat.com>
Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

14134f65

26 3月, 2013 3 次提交

NFC: llcp: Keep the connected socket parent pointer alive · 39a352a5

由 Samuel Ortiz 提交于 3月 26, 2013

And avoid decreasing the ack log twice when dequeueing connected LLCP
sockets.
Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>

39a352a5

ipv6: fix bad free of addrconf_init_net · a79ca223

由 Hong Zhiguo 提交于 3月 26, 2013

Signed-off-by: NHong Zhiguo <honkiko@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a79ca223

unix: fix a race condition in unix_release() · ded34e0f

由 Paul Moore 提交于 3月 25, 2013

As reported by Jan, and others over the past few years, there is a
race condition caused by unix_release setting the sock->sk pointer
to NULL before properly marking the socket as dead/orphaned.  This
can cause a problem with the LSM hook security_unix_may_send() if
there is another socket attempting to write to this partially
released socket in between when sock->sk is set to NULL and it is
marked as dead/orphaned.  This patch fixes this by only setting
sock->sk to NULL after the socket has been marked as dead; I also
take the opportunity to make unix_release_sock() a void function
as it only ever returned 0/success.

Dave, I think this one should go on the -stable pile.

Special thanks to Jan for coming up with a reproducer for this
problem.
Reported-by: NJan Stancek <jan.stancek@gmail.com>
Signed-off-by: NPaul Moore <pmoore@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ded34e0f

25 3月, 2013 10 次提交

mac80211: fix idle handling sequence · 382a103b

由 Johannes Berg 提交于 3月 22, 2013

Corey Richardson reported that my idle handling cleanup
(commit fd0f979a, "mac80211: simplify idle handling")
broke ath9k_htc. The reason appears to be that it wants
to go out of idle before switching channels. To fix it,
reimplement that sequence.
Reported-by: NCorey Richardson <corey@octayn.net>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

382a103b

SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked · 1166fde6

由 Trond Myklebust 提交于 3月 25, 2013

We need to be careful when testing task->tk_waitqueue in
rpc_wake_up_task_queue_locked, because it can be changed while we
are holding the queue->lock.
By adding appropriate memory barriers, we can ensure that it is safe to
test task->tk_waitqueue for equality if the RPC_TASK_QUEUED bit is set.
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org

1166fde6

netfilter: nfnetlink_acct: return -EINVAL if object name is empty · deadcfc3

由 Pablo Neira Ayuso 提交于 3月 23, 2013

If user-space tries to create accounting object with an empty
name, then return -EINVAL.
Reported-by: NMichael Zintakis <michael.zintakis@googlemail.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

deadcfc3

netfilter: nfnetlink_queue: fix error return code in nfnetlink_queue_init() · 558724a5

由 Wei Yongjun 提交于 3月 22, 2013

Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.
Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

558724a5

mac80211: fix remain-on-channel cancel crash · 3fbd45ca

由 Johannes Berg 提交于 3月 25, 2013

If a ROC item is canceled just as it expires, the work
struct may be scheduled while it is running (and waiting
for the mutex). This results in it being run after being
freed, which obviously crashes.

To fix this don't free it when aborting is requested but
instead mark it as "to be freed", which makes the work a
no-op and allows freeing it outside.

Cc: stable@vger.kernel.org [3.6+]
Reported-by: NJouni Malinen <j@w1.fi>
Tested-by: NJouni Malinen <j@w1.fi>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

3fbd45ca

xfrm: Fix esn sequence number diff calculation in xfrm_replay_notify_esn() · 799ef90c

由 Mathias Krause 提交于 3月 20, 2013

Commit 0017c0b5 "xfrm: Fix replay notification for esn." is off by one
for the sequence number wrapped case as UINT_MAX is 0xffffffff, not
0x100000000. ;)

Just calculate the diff like done everywhere else in the file.
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

799ef90c

tcp: undo spurious timeout after SACK reneging · 7ebe183c

由 Yuchung Cheng 提交于 3月 24, 2013

On SACK reneging the sender immediately retransmits and forces a
timeout but disables Eifel (undo). If the (buggy) receiver does not
drop any packet this can trigger a false slow-start retransmit storm
driven by the ACKs of the original packets. This can be detected with
undo and TCP timestamps.
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7ebe183c

bridge: fix crash when set mac address of br interface · 9b46922e

由 Hong zhi guo 提交于 3月 23, 2013

When I tried to set mac address of a bridge interface to a mac
address which already learned on this bridge, I got system hang.

The cause is straight forward: function br_fdb_change_mac_address
calls fdb_insert with NULL source nbp. Then an fdb lookup is
performed. If an fdb entry is found and it's local, it's OK. But
if it's not local, source is dereferenced for printk without NULL
check.
Signed-off-by: NHong Zhiguo <honkiko@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9b46922e

8021q: fix a potential use-after-free · 4a7df340

由 Cong Wang 提交于 3月 22, 2013

vlan_vid_del() could possibly free ->vlan_info after a RCU grace
period, however, we may still refer to the freed memory area
by 'grp' pointer. Found by code inspection.

This patch moves vlan_vid_del() as behind as possible.

Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: NCong Wang <amwang@redhat.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4a7df340

net: remove a WARN_ON() in net_enable_timestamp() · 9979a55a

由 Eric Dumazet 提交于 3月 22, 2013

The WARN_ON(in_interrupt()) in net_enable_timestamp() can get false
positive, in socket clone path, run from softirq context :

[ 3641.624425] WARNING: at net/core/dev.c:1532 net_enable_timestamp+0x7b/0x80()
[ 3641.668811] Call Trace:
[ 3641.671254]  <IRQ>  [<ffffffff80286817>] warn_slowpath_common+0x87/0xc0
[ 3641.677871]  [<ffffffff8028686a>] warn_slowpath_null+0x1a/0x20
[ 3641.683683]  [<ffffffff80742f8b>] net_enable_timestamp+0x7b/0x80
[ 3641.689668]  [<ffffffff80732ce5>] sk_clone_lock+0x425/0x450
[ 3641.695222]  [<ffffffff8078db36>] inet_csk_clone_lock+0x16/0x170
[ 3641.701213]  [<ffffffff807ae449>] tcp_create_openreq_child+0x29/0x820
[ 3641.707663]  [<ffffffff807d62e2>] ? ipt_do_table+0x222/0x670
[ 3641.713354]  [<ffffffff807aaf5b>] tcp_v4_syn_recv_sock+0xab/0x3d0
[ 3641.719425]  [<ffffffff807af63a>] tcp_check_req+0x3da/0x530
[ 3641.724979]  [<ffffffff8078b400>] ? inet_hashinfo_init+0x60/0x80
[ 3641.730964]  [<ffffffff807ade6f>] ? tcp_v4_rcv+0x79f/0xbe0
[ 3641.736430]  [<ffffffff807ab9bd>] tcp_v4_do_rcv+0x38d/0x4f0
[ 3641.741985]  [<ffffffff807ae14a>] tcp_v4_rcv+0xa7a/0xbe0

Its safe at this point because the parent socket owns a reference
on the netstamp_needed, so we cant have a 0 -> 1 transition, which
requires to lock a mutex.

Instead of refining the check, lets remove it, as all known callers
are safe. If it ever changes in the future, static_key_slow_inc()
will complain anyway.
Reported-by: NLaurent Chavey <chavey@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9979a55a

24 3月, 2013 2 次提交

mac80211: Don't restart sta-timer if not associated. · 370bd005

由 Ben Greear 提交于 3月 19, 2013

I found another crash when deleting lots of virtual stations
in a congested environment.  I think the problem is that
the ieee80211_mlme_notify_scan_completed could call
ieee80211_restart_sta_timer for a stopped interface
that was about to be deleted.

With the following patch I am unable to reproduce the
crash.
Signed-off-by: NBen Greear <greearb@candelatech.com>
[move check, also make the same change in mesh]
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

370bd005

cfg80211: always check for scan end on P2P device · f9f47529

由 Johannes Berg 提交于 3月 19, 2013

If a P2P device wdev is removed while it has a scan, then the
scan completion might crash later as it is already freed by
that time. To avoid the crash always check the scan completion
when the P2P device is being removed for some reason. If the
driver already canceled it, don't want and free it, otherwise
warn and leak it to avoid later crashes.

In order to do this, locking needs to be changed away from the
rdev mutex (which can't always be guaranteed). For now, use
the sched_scan_mtx instead, I'll rename it to just scan_mtx in
a later patch.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

f9f47529

22 3月, 2013 1 次提交

tcp: preserve ACK clocking in TSO · f4541d60

由 Eric Dumazet 提交于 3月 21, 2013

A long standing problem with TSO is the fact that tcp_tso_should_defer()
rearms the deferred timer, while it should not.

Current code leads to following bad bursty behavior :

20:11:24.484333 IP A > B: . 297161:316921(19760) ack 1 win 119
20:11:24.484337 IP B > A: . ack 263721 win 1117
20:11:24.485086 IP B > A: . ack 265241 win 1117
20:11:24.485925 IP B > A: . ack 266761 win 1117
20:11:24.486759 IP B > A: . ack 268281 win 1117
20:11:24.487594 IP B > A: . ack 269801 win 1117
20:11:24.488430 IP B > A: . ack 271321 win 1117
20:11:24.489267 IP B > A: . ack 272841 win 1117
20:11:24.490104 IP B > A: . ack 274361 win 1117
20:11:24.490939 IP B > A: . ack 275881 win 1117
20:11:24.491775 IP B > A: . ack 277401 win 1117
20:11:24.491784 IP A > B: . 316921:332881(15960) ack 1 win 119
20:11:24.492620 IP B > A: . ack 278921 win 1117
20:11:24.493448 IP B > A: . ack 280441 win 1117
20:11:24.494286 IP B > A: . ack 281961 win 1117
20:11:24.495122 IP B > A: . ack 283481 win 1117
20:11:24.495958 IP B > A: . ack 285001 win 1117
20:11:24.496791 IP B > A: . ack 286521 win 1117
20:11:24.497628 IP B > A: . ack 288041 win 1117
20:11:24.498459 IP B > A: . ack 289561 win 1117
20:11:24.499296 IP B > A: . ack 291081 win 1117
20:11:24.500133 IP B > A: . ack 292601 win 1117
20:11:24.500970 IP B > A: . ack 294121 win 1117
20:11:24.501388 IP B > A: . ack 295641 win 1117
20:11:24.501398 IP A > B: . 332881:351881(19000) ack 1 win 119

While the expected behavior is more like :

20:19:49.259620 IP A > B: . 197601:202161(4560) ack 1 win 119
20:19:49.260446 IP B > A: . ack 154281 win 1212
20:19:49.261282 IP B > A: . ack 155801 win 1212
20:19:49.262125 IP B > A: . ack 157321 win 1212
20:19:49.262136 IP A > B: . 202161:206721(4560) ack 1 win 119
20:19:49.262958 IP B > A: . ack 158841 win 1212
20:19:49.263795 IP B > A: . ack 160361 win 1212
20:19:49.264628 IP B > A: . ack 161881 win 1212
20:19:49.264637 IP A > B: . 206721:211281(4560) ack 1 win 119
20:19:49.265465 IP B > A: . ack 163401 win 1212
20:19:49.265886 IP B > A: . ack 164921 win 1212
20:19:49.266722 IP B > A: . ack 166441 win 1212
20:19:49.266732 IP A > B: . 211281:215841(4560) ack 1 win 119
20:19:49.267559 IP B > A: . ack 167961 win 1212
20:19:49.268394 IP B > A: . ack 169481 win 1212
20:19:49.269232 IP B > A: . ack 171001 win 1212
20:19:49.269241 IP A > B: . 215841:221161(5320) ack 1 win 119
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f4541d60

21 3月, 2013 8 次提交

mac80211: fix virtual monitor interface locking · 8b305780

由 Johannes Berg 提交于 3月 20, 2013

The virtual monitor interface has a locking issue, it calls
into the channel context code with the iflist mutex held
which isn't allowed since it is usually acquired the other
way around. The mutex is still required for the interface
iteration, but need not be held across the channel calls.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

8b305780

cfg80211: fix wdev tracing crash · ce1eadda

由 Johannes Berg 提交于 3月 19, 2013

Arend reported a crash in tracing if the driver returns an
ERR_PTR() value from the add_virtual_intf() callback. This
is due to the tracing then still attempting to dereference
the "pointer", fix this by using IS_ERR_OR_NULL().
Reported-by: NArend van Spriel <arend@broadcom.com>
Tested-by: NArend van Spriel <arend@broadcom.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

ce1eadda

net/irda: add missing error path release_sock call · 896ee0ee

由 Kees Cook 提交于 3月 20, 2013

This makes sure that release_sock is called for all error conditions in
irda_getsockopt.
Signed-off-by: NKees Cook <keescook@chromium.org>
Reported-by: NBrad Spengler <spender@grsecurity.net>
Cc: stable@vger.kernel.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

896ee0ee

ipconfig: Fix newline handling in log message. · 283951f9

由 Martin Fuzzey 提交于 3月 19, 2013

When using ipconfig the logs currently look like:

Single name server:
[    3.467270] IP-Config: Complete:
[    3.470613]      device=eth0, hwaddr=ac:de:48:00:00:01, ipaddr=172.16.42.2, mask=255.255.255.0, gw=172.16.42.1
[    3.480670]      host=infigo-1, domain=, nis-domain=(none)
[    3.486166]      bootserver=172.16.42.1, rootserver=172.16.42.1, rootpath=
[    3.492910]      nameserver0=172.16.42.1[    3.496853] ALSA device list:

Three name servers:
[    3.496949] IP-Config: Complete:
[    3.500293]      device=eth0, hwaddr=ac:de:48:00:00:01, ipaddr=172.16.42.2, mask=255.255.255.0, gw=172.16.42.1
[    3.510367]      host=infigo-1, domain=, nis-domain=(none)
[    3.515864]      bootserver=172.16.42.1, rootserver=172.16.42.1, rootpath=
[    3.522635]      nameserver0=172.16.42.1, nameserver1=172.16.42.100
[    3.529149] , nameserver2=172.16.42.200

Fix newline handling for these cases
Signed-off-by: NMartin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

283951f9

flow_keys: include thoff into flow_keys for later usage · 8ed78166

由 Daniel Borkmann 提交于 3月 19, 2013

In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're
doing the work here anyway, also store thoff for a later usage, e.g. in
the BPF filter.
Suggested-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8ed78166

l2tp: unhash l2tp sessions on delete, not on free · f6e16b29

由 Tom Parkin 提交于 3月 19, 2013

If we postpone unhashing of l2tp sessions until the structure is freed, we
risk:

 1. further packets arriving and getting queued while the pseudowire is being
    closed down
 2. the recv path hitting "scheduling while atomic" errors in the case that
    recv drops the last reference to a session and calls l2tp_session_free
    while in atomic context

As such, l2tp sessions should be unhashed from l2tp_core data structures early
in the teardown process prior to calling pseudowire close.  For pseudowires
like l2tp_ppp which have multiple shutdown codepaths, provide an unhash hook.
Signed-off-by: NTom Parkin <tparkin@katalix.com>
Signed-off-by: NJames Chapman <jchapman@katalix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f6e16b29

l2tp: avoid deadlock in l2tp stats update · 7b7c0719

由 Tom Parkin 提交于 3月 19, 2013

l2tp's u64_stats writers were incorrectly synchronised, making it possible to
deadlock a 64bit machine running a 32bit kernel simply by sending the l2tp
code netlink commands while passing data through l2tp sessions.

Previous discussion on netdev determined that alternative solutions such as
spinlock writer synchronisation or per-cpu data would bring unjustified
overhead, given that most users interested in high volume traffic will likely
be running 64bit kernels on 64bit hardware.

As such, this patch replaces l2tp's use of u64_stats with atomic_long_t,
thereby avoiding the deadlock.

Ref:
http://marc.info/?l=linux-netdev&m=134029167910731&w=2
http://marc.info/?l=linux-netdev&m=134079868111131&w=2Signed-off-by: NTom Parkin <tparkin@katalix.com>
Signed-off-by: NJames Chapman <jchapman@katalix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7b7c0719

l2tp: push all ppp pseudowire shutdown through .release handler · cf2f5c88

由 Tom Parkin 提交于 3月 19, 2013

If userspace deletes a ppp pseudowire using the netlink API, either by
directly deleting the session or by deleting the tunnel that contains the
session, we need to tear down the corresponding pppox channel.

Rather than trying to manage two pppox unbind codepaths, switch the netlink
and l2tp_core session_close handlers to close via. the l2tp_ppp socket
.release handler.
Signed-off-by: NTom Parkin <tparkin@katalix.com>
Signed-off-by: NJames Chapman <jchapman@katalix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf2f5c88

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功