提交 · 0508019cca6c425a9db79b2b3aa37afba0a5542b · openeuler / raspberrypi-kernel

21 11月, 2013 11 次提交

由 Andy Fleming 提交于 11月 20, 2013

Vitesse VSC8234 is quad port 10/100/1000BASE-T PHY
with SGMII and SERDES MAC interfaces.
Signed-off-by: NAndy Fleming <afleming@gmail.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
Signed-off-by: NShruti Kanetkar <Shruti@freescale.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0508019c

net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage) · 68c6beb3

由 Hannes Frederic Sowa 提交于 11月 21, 2013

In that case it is probable that kernel code overwrote part of the
stack. So we should bail out loudly here.

The BUG_ON may be removed in future if we are sure all protocols are
conformant.
Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

68c6beb3

net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426

由 Hannes Frederic Sowa 提交于 11月 21, 2013

This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.

This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.

Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.

Also document these changes in include/linux/net.h as suggested by David
Miller.

Changes since RFC:

Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.

With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
	msg->msg_name = NULL
".

This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.

Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.

Cc: David Miller <davem@davemloft.net>
Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f3d33426

bridge: flush br's address entry in fdb when remove the · f8730420

由 Ding Tianhong 提交于 12月 07, 2013

 bridge dev

When the following commands are executed:

brctl addbr br0
ifconfig br0 hw ether <addr>
rmmod bridge

The calltrace will occur:

[  563.312114] device eth1 left promiscuous mode
[  563.312188] br0: port 1(eth1) entered disabled state
[  563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects
[  563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G           O 3.12.0-0.7-default+ #9
[  563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  563.468200]  0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8
[  563.468204]  ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8
[  563.468206]  ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78
[  563.468209] Call Trace:
[  563.468218]  [<ffffffff814d1c92>] dump_stack+0x6a/0x78
[  563.468234]  [<ffffffff81148efd>] kmem_cache_destroy+0xfd/0x100
[  563.468242]  [<ffffffffa062a270>] br_fdb_fini+0x10/0x20 [bridge]
[  563.468247]  [<ffffffffa063ac76>] br_deinit+0x4e/0x50 [bridge]
[  563.468254]  [<ffffffff810c7dc9>] SyS_delete_module+0x199/0x2b0
[  563.468259]  [<ffffffff814e0922>] system_call_fastpath+0x16/0x1b
[  570.377958] Bridge firewalling registered

--------------------------- cut here -------------------------------

The reason is that when the bridge dev's address is changed, the
br_fdb_change_mac_address() will add new address in fdb, but when
the bridge was removed, the address entry in the fdb did not free,
the bridge_fdb_cache still has objects when destroy the cache, Fix
this by flushing the bridge address entry when removing the bridge.

v2: according to the Toshiaki Makita and Vlad's suggestion, I only
    delete the vlan0 entry, it still have a leak here if the vlan id
    is other number, so I need to call fdb_delete_by_port(br, NULL, 1)
    to flush all entries whose dst is NULL for the bridge.
Suggested-by: NToshiaki Makita <toshiaki.makita1@gmail.com>
Suggested-by: NVlad Yasevich <vyasevich@gmail.com>
Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f8730420

net: core: Always propagate flag changes to interfaces · d2615bf4

由 Vlad Yasevich 提交于 11月 19, 2013

The following commit:
    b6c40d68
    net: only invoke dev->change_rx_flags when device is UP

tried to fix a problem with VLAN devices and promiscuouse flag setting.
The issue was that VLAN device was setting a flag on an interface that
was down, thus resulting in bad promiscuity count.
This commit blocked flag propagation to any device that is currently
down.

A later commit:
    deede2fa
    vlan: Don't propagate flag changes on down interfaces

fixed VLAN code to only propagate flags when the VLAN interface is up,
thus fixing the same issue as above, only localized to VLAN.

The problem we have now is that if we have create a complex stack
involving multiple software devices like bridges, bonds, and vlans,
then it is possible that the flags would not propagate properly to
the physical devices.  A simple examle of the scenario is the
following:

  eth0----> bond0 ----> bridge0 ---> vlan50

If bond0 or eth0 happen to be down at the time bond0 is added to
the bridge, then eth0 will never have promisc mode set which is
currently required for operation as part of the bridge.  As a
result, packets with vlan50 will be dropped by the interface.

The only 2 devices that implement the special flag handling are
VLAN and DSA and they both have required code to prevent incorrect
flag propagation.  As a result we can remove the generic solution
introduced in b6c40d68 and leave
it to the individual devices to decide whether they will block
flag propagation or not.
Reported-by: NStefan Priebe <s.priebe@profihost.ag>
Suggested-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d2615bf4

ipv4: fix race in concurrent ip_route_input_slow() · dcdfdf56

由 Alexei Starovoitov 提交于 11月 19, 2013

CPUs can ask for local route via ip_route_input_noref() concurrently.
if nh_rth_input is not cached yet, CPUs will proceed to allocate
equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
via rt_cache_route()
Most of the time they succeed, but on occasion the following two lines:
	orig = *p;
	prev = cmpxchg(p, orig, rt);
in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
so dst is leaking. dst_destroy() is never called and 'lo' device
refcnt doesn't go to zero, which can be seen in the logs as:
	unregister_netdevice: waiting for lo to become free. Usage count = 1
Adding mdelay() between above two lines makes it easily reproducible.
Fix it similar to nh_pcpu_rth_output case.

Fixes: d2d68ba9 ("ipv4: Cache input routes in fib_info nexthops.")
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dcdfdf56

Merge branch 'r8152' · 4f837c3b

由 David S. Miller 提交于 11月 20, 2013

Hayes Wang says:

====================
r8152 bug fixes

For the patch #3, I add netif_tx_lock() before checking the
netif_queue_stopped(). Besides, I add checking the skb queue
length before waking the tx queue.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f837c3b

r8152: fix incorrect type in assignment · 500b6d7e

由 hayeswang 提交于 11月 20, 2013

The data from the hardware should be little endian. Correct the
declaration.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

500b6d7e

r8152: support stopping/waking tx queue · dd1b119c

由 hayeswang 提交于 11月 20, 2013

The maximum packet number which a tx aggregation buffer could contain
is the tx_qlen.

	tx_qlen = buffer size / (packet size + descriptor size).

If the tx buffer is empty and the queued packets are more than the
maximum value which is defined above, stop the tx queue. Wake the
tx queue if tx queue is stopped and the queued packets are less than
tx_qlen.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dd1b119c

r8152: modify the tx flow · 61598788

由 hayeswang 提交于 11月 20, 2013

Remove the code for sending the packet in the rtl8152_start_xmit().
Let rtl8152_start_xmit() to queue the packet only, and schedule a
tasklet to send the queued packets. This simplify the code and make
sure all the packet would be sent by the original order.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

61598788

r8152: fix tx/rx memory overflow · 7937f9e5

由 hayeswang 提交于 11月 20, 2013

The tx/rx would access the memory which is out of the desired range.
Modify the method of checking the end of the memory to avoid it.

For r8152_tx_agg_fill(), the variable remain may become negative.
However, the declaration is unsigned, so the while loop wouldn't
break when reaching the end of the desied memory. Although to change
the declaration from unsigned to signed is enough to fix it, I also
modify the checking method for safe. Replace

		remain = rx_buf_sz - sizeof(*tx_desc) -
			 (u32)((void *)tx_data - agg->head);

with

		remain = rx_buf_sz - (int)(tx_agg_align(tx_data) - agg->head);

to make sure the variable remain is always positive. Then, the
overflow wouldn't happen.

For rx_bottom(), the rx_desc should not be used to calculate the
packet length before making sure the rx_desc is in the desired range.
Change the checking to two parts. First, check the descriptor is in
the memory. The other, using the descriptor to find out the packet
length and check if the packet is in the memory.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7937f9e5

20 11月, 2013 24 次提交

aacraid: prevent invalid pointer dereference · b4789b8e

由 Mahesh Rajashekhara 提交于 10月 31, 2013

It appears that driver runs into a problem here if fibsize is too small
because we allocate user_srbcmd with fibsize size only but later we
access it until user_srbcmd->sg.count to copy it over to srbcmd.

It is not correct to test (fibsize < sizeof(*user_srbcmd)) because this
structure already includes one sg element and this is not needed for
commands without data.  So, we would recommend to add the following
(instead of test for fibsize == 0).
Signed-off-by: NMahesh Rajashekhara <Mahesh.Rajashekhara@pmcs.com>
Reported-by: NNico Golde <nico@ngolde.de>
Reported-by: NFabian Yamaguchi <fabs@goesec.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b4789b8e

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 1ee2dcc2

由 Linus Torvalds 提交于 11月 19, 2013

Pull networking fixes from David Miller:
 "Mostly these are fixes for fallout due to merge window changes, as
  well as cures for problems that have been with us for a much longer
  period of time"

 1) Johannes Berg noticed two major deficiencies in our genetlink
    registration.  Some genetlink protocols we passing in constant
    counts for their ops array rather than something like
    ARRAY_SIZE(ops) or similar.  Also, some genetlink protocols were
    using fixed IDs for their multicast groups.

    We have to retain these fixed IDs to keep existing userland tools
    working, but reserve them so that other multicast groups used by
    other protocols can not possibly conflict.

    In dealing with these two problems, we actually now use less state
    management for genetlink operations and multicast groups.

 2) When configuring interface hardware timestamping, fix several
    drivers that simply do not validate that the hwtstamp_config value
    is one the driver actually supports.  From Ben Hutchings.

 3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.

 4) In dev_forward_skb(), set the skb->protocol in the right order
    relative to skb_scrub_packet().  From Alexei Starovoitov.

 5) Bridge erroneously fails to use the proper wrapper functions to make
    calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid.  Fix from Toshiaki
    Makita.

 6) When detaching a bridge port, make sure to flush all VLAN IDs to
    prevent them from leaking, also from Toshiaki Makita.

 7) Put in a compromise for TCP Small Queues so that deep queued devices
    that delay TX reclaim non-trivially don't have such a performance
    decrease.  One particularly problematic area is 802.11 AMPDU in
    wireless.  From Eric Dumazet.

 8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
    here.  Fix from Eric Dumzaet, reported by Dave Jones.

 9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.

10) When computing mergeable buffer sizes, virtio-net fails to take the
    virtio-net header into account.  From Michael Dalton.

11) Fix seqlock deadlock in ip4_datagram_connect() wrt.  statistic
    bumping, this one has been with us for a while.  From Eric Dumazet.

12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
    Hugne.

13) 6lowpan bit used for traffic classification was wrong, from Jukka
    Rissanen.

14) macvlan has the same issue as normal vlans did wrt.  propagating LRO
    disabling down to the real device, fix it the same way.  From Michal
    Kubecek.

15) CPSW driver needs to soft reset all slaves during suspend, from
    Daniel Mack.

16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.

17) The xen-netfront RX buffer refill timer isn't properly scheduled on
    partial RX allocation success, from Ma JieYue.

18) When ipv6 ping protocol support was added, the AF_INET6 protocol
    initialization cleanup path on failure was borked a little.  Fix
    from Vlad Yasevich.

19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
    blocks we can do the wrong thing with the msg_name we write back to
    userspace.  From Hannes Frederic Sowa.  There is another fix in the
    works from Hannes which will prevent future problems of this nature.

20) Fix route leak in VTI tunnel transmit, from Fan Du.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
  genetlink: make multicast groups const, prevent abuse
  genetlink: pass family to functions using groups
  genetlink: add and use genl_set_err()
  genetlink: remove family pointer from genl_multicast_group
  genetlink: remove genl_unregister_mc_group()
  hsr: don't call genl_unregister_mc_group()
  quota/genetlink: use proper genetlink multicast APIs
  drop_monitor/genetlink: use proper genetlink multicast APIs
  genetlink: only pass array to genl_register_family_with_ops()
  tcp: don't update snd_nxt, when a socket is switched from repair mode
  atm: idt77252: fix dev refcnt leak
  xfrm: Release dst if this dst is improper for vti tunnel
  netlink: fix documentation typo in netlink_set_err()
  be2net: Delete secondary unicast MAC addresses during be_close
  be2net: Fix unconditional enabling of Rx interface options
  net, virtio_net: replace the magic value
  ping: prevent NULL pointer dereference on write to msg_name
  bnx2x: Prevent "timeout waiting for state X"
  bnx2x: prevent CFC attention
  bnx2x: Prevent panic during DMAE timeout
  ...

1ee2dcc2

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 4457e6f6

由 Linus Torvalds 提交于 11月 19, 2013

Pull sparc fixes from David Miller:
 "Two merge window fallout build fixes"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: merge fix
  sparc64: fix build regession

4457e6f6

Merge tag 'please-pull-fixia64' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux · e87e7be9

由 Linus Torvalds 提交于 11月 19, 2013

Pull ia64 fix from Tony Luck:
 "Unbreak ia64 build by avoiding circular dependency"

* tag 'please-pull-fixia64' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  kernel/bounds: avoid circular dependencies in generated headers

e87e7be9

kernel/bounds: avoid circular dependencies in generated headers · 24b9fdc5

由 Kirill A. Shutemov 提交于 11月 18, 2013

<linux/spinlock.h> has heavy dependencies on other header files.
It triggers circular dependencies in generated headers on IA64, at
least:

  CC      kernel/bounds.s
In file included from /home/space/kas/git/public/linux/arch/ia64/include/asm/thread_info.h:9:0,
                 from include/linux/thread_info.h:54,
                 from include/asm-generic/preempt.h:4,
                 from arch/ia64/include/generated/asm/preempt.h:1,
                 from include/linux/preempt.h:18,
                 from include/linux/spinlock.h:50,
                 from kernel/bounds.c:14:
/home/space/kas/git/public/linux/arch/ia64/include/asm/asm-offsets.h:1:35: fatal error: generated/asm-offsets.h: No such file or directory
compilation terminated.

Let's replace <linux/spinlock.h> with <linux/spinlock_types.h>, it's
enough to find out size of spinlock_t.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-and-Tested-by: NTony Luck <tony.luck@intel.com>
Signed-off-by: NTony Luck <tony.luck@intel.com>

24b9fdc5

Merge branch 'genetlink_mcast' · 091e0662

由 David S. Miller 提交于 11月 19, 2013

Johannes Berg says:

====================
genetlink: clean up multicast group APIs

The generic netlink multicast group registration doesn't have to
be dynamic, and can thus be simplified just like I did with the
ops. This removes some complexity in registration code.

Additionally, two users of generic netlink already use multicast
groups in a wrong way, add workarounds for those two to keep the
userspace API working, but at the same time make them not clash
with other users of multicast groups as might happen now.

While making it all a bit easier, also prevent such abuse by adding
checks to the APIs so each family can only use the groups it owns.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

091e0662

genetlink: make multicast groups const, prevent abuse · 2a94fe48

由 Johannes Berg 提交于 11月 19, 2013

Register generic netlink multicast groups as an array with
the family and give them contiguous group IDs. Then instead
of passing the global group ID to the various functions that
send messages, pass the ID relative to the family - for most
families that's just 0 because the only have one group.

This avoids the list_head and ID in each group, adding a new
field for the mcast group ID offset to the family.

At the same time, this allows us to prevent abusing groups
again like the quota and dropmon code did, since we can now
check that a family only uses a group it owns.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2a94fe48

genetlink: pass family to functions using groups · 68eb5503

由 Johannes Berg 提交于 11月 19, 2013

This doesn't really change anything, but prepares for the
next patch that will change the APIs to pass the group ID
within the family, rather than the global group ID.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

68eb5503

genetlink: add and use genl_set_err() · 62b68e99

由 Johannes Berg 提交于 11月 19, 2013

Add a static inline to generic netlink to wrap netlink_set_err()
to make it easier to use here - use it in openvswitch (the only
generic netlink user of netlink_set_err()).
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62b68e99

genetlink: remove family pointer from genl_multicast_group · c2ebb908

由 Johannes Berg 提交于 11月 19, 2013

There's no reason to have the family pointer there since it
can just be passed internally where needed, so remove it.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c2ebb908

genetlink: remove genl_unregister_mc_group() · 06fb555a

由 Johannes Berg 提交于 11月 19, 2013

There are no users of this API remaining, and we'll soon
change group registration to be static (like ops are now)
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

06fb555a

hsr: don't call genl_unregister_mc_group() · 03ed3827

由 Johannes Berg 提交于 11月 19, 2013

There's no need to unregister the multicast group if the
generic netlink family is registered immediately after.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

03ed3827

quota/genetlink: use proper genetlink multicast APIs · 2ecf7536

由 Johannes Berg 提交于 11月 19, 2013

The quota code is abusing the genetlink API and is using
its family ID as the multicast group ID, which is invalid
and may belong to somebody else (and likely will.)

Make the quota code use the correct API, but since this
is already used as-is by userspace, reserve a family ID
for this code and also reserve that group ID to not break
userspace assumptions.
Acked-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2ecf7536

drop_monitor/genetlink: use proper genetlink multicast APIs · e5dcecba

由 Johannes Berg 提交于 11月 19, 2013

The drop monitor code is abusing the genetlink API and is
statically using the generic netlink multicast group 1, even
if that group belongs to somebody else (which it invariably
will, since it's not reserved.)

Make the drop monitor code use the proper APIs to reserve a
group ID, but also reserve the group id 1 in generic netlink
code to preserve the userspace API. Since drop monitor can
be a module, don't clear the bit for it on unregistration.
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5dcecba

genetlink: only pass array to genl_register_family_with_ops() · c53ed742

由 Johannes Berg 提交于 11月 19, 2013

As suggested by David Miller, make genl_register_family_with_ops()
a macro and pass only the array, evaluating ARRAY_SIZE() in the
macro, this is a little safer.

The openvswitch has some indirection, assing ops/n_ops directly in
that code. This might ultimately just assign the pointers in the
family initializations, saving the struct genl_family_and_ops and
code (once mcast groups are handled differently.)
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c53ed742

tcp: don't update snd_nxt, when a socket is switched from repair mode · dbde4979

由 Andrey Vagin 提交于 11月 19, 2013

snd_nxt must be updated synchronously with sk_send_head.  Otherwise
tp->packets_out may be updated incorrectly, what may bring a kernel panic.

Here is a kernel panic from my host.
[  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
...
[  146.301158] Call Trace:
[  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0

Before this panic a tcp socket was restored. This socket had sent and
unsent data in the write queue. Sent data was restored in repair mode,
then the socket was switched from reapair mode and unsent data was
restored. After that the socket was switched back into repair mode.

In that moment we had a socket where write queue looks like this:
snd_una    snd_nxt   write_seq
   |_________|________|
             |
	  sk_send_head

After a second switching from repair mode the state of socket was
changed:

snd_una          snd_nxt, write_seq
   |_________ ________|
             |
	  sk_send_head

This state is inconsistent, because snd_nxt and sk_send_head are not
synchronized.

Bellow you can find a call trace, how packets_out can be incremented
twice for one skb, if snd_nxt and sk_send_head are not synchronized.
In this case packets_out will be always positive, even when
sk_write_queue is empty.

tcp_write_wakeup
	skb = tcp_send_head(sk);
	tcp_fragment
		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
			tcp_adjust_pcount(sk, skb, diff);
	tcp_event_new_data_sent
		tp->packets_out += tcp_skb_pcount(skb);

I think update of snd_nxt isn't required, when a socket is switched from
repair mode.  Because it's initialized in tcp_connect_init. Then when a
write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
so it's always is in consistent state.

I have checked, that the bug is not reproduced with this patch and
all tests about restoring tcp connections work fine.

Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: NAndrey Vagin <avagin@openvz.org>
Acked-by: NPavel Emelyanov <xemul@parallels.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dbde4979

atm: idt77252: fix dev refcnt leak · b5de4a22

由 Ying Xue 提交于 11月 19, 2013

init_card() calls dev_get_by_name() to get a network deceive. But it
doesn't decrease network device reference count after the device is
used.
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b5de4a22

xfrm: Release dst if this dst is improper for vti tunnel · 236c9f84

由 fan.du 提交于 11月 19, 2013

After searching rt by the vti tunnel dst/src parameter,
if this rt has neither attached to any transformation
nor the transformation is not tunnel oriented, this rt
should be released back to ip layer.

otherwise causing dst memory leakage.
Signed-off-by: NFan Du <fan.du@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

236c9f84

netlink: fix documentation typo in netlink_set_err() · 840e93f2

由 Johannes Berg 提交于 11月 19, 2013

The parameter is just 'group', not 'groups', fix the documentation typo.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

840e93f2

Merge tag 'arc-v3.13-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · dec8e461

由 Linus Torvalds 提交于 11月 19, 2013

Pull second set of ARC changes from Vineet Gupta:
 - Support for Perf from Mischa
 - Enabling GPIO/Pinctrl drivers for Abilis TB10x platform
 - New defconfig for buildroot

* tag 'arc-v3.13-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [plat-arcfpga] Add defconfig without initramfs location
  ARC: perf: ARC 700 PMU doesn't support sampling events
  ARC: Add documentation on DT binding for ARC700 PMU
  ARC: Add perf support for ARC700 cores
  ARC: [TB10x] Updates for GPIO and pinctrl

dec8e461

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 806dace6

由 Linus Torvalds 提交于 11月 19, 2013

Pull second set of s390 patches from Martin Schwidefsky:
 "The handling of the PCI hotplug notifications has been improved, the
  zfcp dumper can now detect the HSA size dynamically and the default
  install kernel has been changed to the compressed bzImage.  And two
  bug-fixes for scm and 3720"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: implement hotplug notifications
  s390/scm_block: do not hide eadm subchannel dependency
  s390/sclp: Consolidate early sclp init calls to sclp_early_detect()
  s390/sclp: Move early code from sclp_cmd.c to sclp_early.c
  s390/sclp: Determine HSA size dynamically for zfcpdump
  s390/sclp: Move declarations for sclp_sdias into separate header file
  s390/pci: implement pcibios_remove_bus
  s390/pci: improve handling of bus resources
  s390/3270: fix missing device_destroy() call
  s390/boot: Install bzImage as default kernel image

806dace6

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · cdc7ef89

由 Linus Torvalds 提交于 11月 19, 2013

Pull UML changes from Richard Weinberger:
 "This pile contains a nice defconfig cleanup, a rewritten stack
  unwinder and various cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Remove unused declarations from <as-layout.h>
  um: remove used STDIO_CONSOLE Kconfig param
  um/vdso: add .gitignore for a couple of targets
  arch/um: make it work with defconfig and x86_64
  um: Make kstack_depth_to_print conform to arch/x86
  um: Get rid of thread_struct->saved_task
  um: Make stack trace reliable against kernel mode faults
  um: Rewrite show_stack()

cdc7ef89

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9066d9b2

由 Linus Torvalds 提交于 11月 19, 2013

Pull x86 fix from Ingo Molnar:
 "A modular build fix for certain .config's"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Export 'boot_cpu_physical_apicid' to modules

9066d9b2

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 40071626

由 Linus Torvalds 提交于 11月 19, 2013

Pull irq cleanups from Ingo Molnar:
 "This is a multi-arch cleanup series from Thomas Gleixner, which we
  kept to near the end of the merge window, to not interfere with
  architecture updates.

  This series (motivated by the -rt kernel) unifies more aspects of IRQ
  handling and generalizes PREEMPT_ACTIVE"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  preempt: Make PREEMPT_ACTIVE generic
  sparc: Use preempt_schedule_irq
  ia64: Use preempt_schedule_irq
  m32r: Use preempt_schedule_irq
  hardirq: Make hardirq bits generic
  m68k: Simplify low level interrupt handling code
  genirq: Prevent spurious detection for unconditionally polled interrupts

40071626

19 11月, 2013 5 次提交

seq_file: always clear m->count when we free m->buf · 801a7605

由 Al Viro 提交于 11月 19, 2013

Once we'd freed m->buf, m->count should become zero - we have no valid
contents reachable via m->buf.
Reported-by: NCharley (Hao Chuan) Chu <charley.chu@broadcom.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

801a7605

Merge git://www.linux-watchdog.org/linux-watchdog · 27b5c3f3

由 Linus Torvalds 提交于 11月 18, 2013

Pull watchdog changes from Wim Van Sebroeck:
 - addition of MOXA ART watchdog driver (moxart_wdt)
 - addition of CSR SiRFprimaII and SiRFatlasVI watchdog driver
   (sirfsoc_wdt)
 - addition of ralink watchdog driver (rt2880_wdt)
 - various fixes and cleanups (__user annotation, ioctl return codes,
   removal of redundant of_match_ptr, removal of unnecessary
   amba_set_drvdata(), use allocated buffer for usb_control_msg, ...)
 - removal of MODULE_ALIAS_MISCDEV statements
 - watchdog related DT bindings
 - first set of improvements on the w83627hf_wdt driver

* git://www.linux-watchdog.org/linux-watchdog: (26 commits)
  watchdog: w83627hf: Use helper functions to access superio registers
  watchdog: w83627hf: Enable watchdog device only if not already enabled
  watchdog: w83627hf: Enable watchdog only once
  watchdog: w83627hf: Convert to watchdog infrastructure
  watchdog: omap_wdt: raw read and write endian fix
  watchdog: sirf: don't depend on dummy value of CLOCK_TICK_RATE
  watchdog: pcwd_usb: overflow in usb_pcwd_send_command()
  watchdog: rt2880_wdt: fix return value check in rt288x_wdt_probe()
  watchdog: watchdog_core: Fix a trivial typo
  watchdog: dw: Enable OF support for DW watchdog timer
  watchdog: Get rid of MODULE_ALIAS_MISCDEV statements
  watchdog: ts72xx_wdt: Propagate return value from timeout_to_regval
  watchdog: pcwd_usb: Use allocated buffer for usb_control_msg
  watchdog: sp805_wdt: Remove unnecessary amba_set_drvdata()
  watchdog: sirf: add watchdog driver of CSR SiRFprimaII and SiRFatlasVI
  watchdog: Remove redundant of_match_ptr
  watchdog: ts72xx_wdt: cleanup return codes in ioctl
  documentation/devicetree: Move DT bindings from gpio to watchdog
  watchdog: add ralink watchdog driver
  watchdog: Add MOXA ART watchdog driver
  ...

27b5c3f3

Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 13509c3a

由 Linus Torvalds 提交于 11月 18, 2013

Pull i2c changes from Wolfram Sang:
 - new drivers for exynos5, bcm kona, and st micro
 - bigger overhauls for drivers mxs and rcar
 - typical driver bugfixes, cleanups, improvements
 - got rid of the superfluous 'driver' member in i2c_client struct This
   touches a few drivers in other subsystems.  All acked.

* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
  i2c: bcm-kona: fix error return code in bcm_kona_i2c_probe()
  i2c: i2c-eg20t: do not print error message in syslog if no ACK received
  i2c: bcm-kona: Introduce Broadcom I2C Driver
  i2c: cbus-gpio: Fix device tree binding
  i2c: wmt: add missing clk_disable_unprepare() on error
  i2c: designware: add new ACPI IDs
  i2c: i801: Add Device IDs for Intel Wildcat Point-LP PCH
  i2c: exynos5: Remove incorrect clk_disable_unprepare
  i2c: i2c-st: Add ST I2C controller
  i2c: exynos5: add High Speed I2C controller driver
  i2c: rcar: fixup rcar type naming
  i2c: scmi: remove some bogus NULL checks
  i2c: sh_mobile & rcar: Enable the driver on all ARM platforms
  i2c: sh_mobile: Convert to clk_prepare/unprepare
  i2c: mux: gpio: use reg value for i2c_add_mux_adapter
  i2c: mux: gpio: use gpio_set_value_cansleep()
  i2c: Include linux/of.h header
  i2c: mxs: Fix PIO mode on i.MX23
  i2c: mxs: Rework the PIO mode operation
  i2c: mxs: distinguish i.MX23 and i.MX28 based I2C controller
  ...

13509c3a

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband · 1ea406c0

由 Linus Torvalds 提交于 11月 18, 2013

Pull infiniband/rdma updates from Roland Dreier:
 - Re-enable flow steering verbs with new improved userspace ABI
 - Fixes for slow connection due to GID lookup scalability
 - IPoIB fixes
 - Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
 - Further improvements to SRP error handling
 - Add new transport type for Cisco usNIC

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
  IB/core: Re-enable create_flow/destroy_flow uverbs
  IB/core: extended command: an improved infrastructure for uverbs commands
  IB/core: Remove ib_uverbs_flow_spec structure from userspace
  IB/core: Use a common header for uverbs flow_specs
  IB/core: Make uverbs flow structure use names like verbs ones
  IB/core: Rename 'flow' structs to match other uverbs structs
  IB/core: clarify overflow/underflow checks on ib_create/destroy_flow
  IB/ucma: Convert use of typedef ctl_table to struct ctl_table
  IB/cm: Convert to using idr_alloc_cyclic()
  IB/mlx5: Fix page shift in create CQ for userspace
  IB/mlx4: Fix device max capabilities check
  IB/mlx5: Fix list_del of empty list
  IB/mlx5: Remove dead code
  IB/core: Encorce MR access rights rules on kernel consumers
  IB/mlx4: Fix endless loop in resize CQ
  RDMA/cma: Remove unused argument and minor dead code
  RDMA/ucma: Discard events for IDs not yet claimed by user space
  IB/core: Add Cisco usNIC rdma node and transport types
  RDMA/nes: Remove self-assignment from nes_query_qp()
  IB/srp: Report receive errors correctly
  ...

1ea406c0

Merge tag 'for-v3.13' of git://git.infradead.org/battery-2.6 · a709bd58

由 Linus Torvalds 提交于 11月 18, 2013

Pull battery updates from Anton Vorontsov:
 "Highlights:
   - A new driver for TI BQ24735 Battery Chargers, courtesy of NVidia.
   - Device tree bindings for TWL4030 chips.
   - Random fixes and cleanups"

* tag 'for-v3.13' of git://git.infradead.org/battery-2.6:
  pm2301-charger: Remove unneeded NULL checks
  twl4030_charger: Add devicetree support
  power_supply: Fix documentation for TEMP_*ALERT* properties
  max17042_battery: Support regmap to access device's registers
  max17042_battery: Use SIMPLE_DEV_PM_OPS
  charger-manager : Replace kzalloc to devm_kzalloc and remove uneccessary code
  bq2415x_charger: Fix max battery regulation voltage
  tps65090-charger: Use "IS_ENABLED(CONFIG_OF)" for DT code
  tps65090-charger: Drop devm_free_irq of devm_ allocated irq
  power_supply: Add support for bq24735 charger
  pm2301-charger: Staticize pm2xxx_charger_die_therm_mngt
  pm2301-charger: Check return value of regulator_enable
  ab8500-charger: Remove redundant break
  ab8500-charger: Check return value of regulator_enable
  isp1704_charger: Fix driver to work with changes introduced in v3.5

a709bd58