Commits · d0c21d43a5a12aaebb1e42e10cf78e6491fc9e5a · openeuler / Kernel

23 May, 2014 5 commits

bonding: Send ALB learning packets using the right source · d0c21d43

Committed by Vlad Yasevich on May 21, 2014

ALB learning packets are currently always sent using the slave mac
address for all vlans configured on top of the bond.  This is not always
correct, as vlans may change their mac address.
This patch introduces a concept of strict matching where the
source of learning packets can either strictly match the address
passed in, or it can determine a more correct address to use.

There are 3 cases to consider:
  1) Switchover.  In this case, we have a new active slave and we need
     to tell the switch about all addresses available on the slave.
  2) Monitor.  We'll periodically refresh learning info for all slaves.
     In this case, we refresh all addresses for the current active slave,
     and just the slave address for other slaves.
  3) Teaching of a disabled address.  This happens as part of the
     failover and in this case, we always use just the address
     provided.
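
The strict versus non-strict choice can be sketched in plain userspace C. This is only a model of the decision, not the bonding driver's code: `struct upper_vlan`, `has_own_mac` and `pick_lp_source()` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAC_LEN 6

/* Hypothetical model of a VLAN device stacked on the bond. */
struct upper_vlan {
    unsigned char mac[MAC_LEN];
    bool has_own_mac;   /* true if the VLAN changed its MAC address */
};

/*
 * Pick the source address for a learning packet.
 * strict:     always use the address passed in (for example when
 *             teaching a disabled address during failover).
 * non-strict: prefer the VLAN's own MAC address if it has one.
 */
static const unsigned char *pick_lp_source(const struct upper_vlan *vlan,
                                           const unsigned char *given_mac,
                                           bool strict)
{
    if (!strict && vlan != NULL && vlan->has_own_mac)
        return vlan->mac;
    return given_mac;
}
```

In this model, cases 1 and 2 would call the helper with `strict` set to false for the active slave, and case 3 would call it with `strict` set to true.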

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Don't assume 802.1Q when sending alb learning packets. · d6b694c0

Committed by Vlad Yasevich on May 21, 2014

TLB/ALB learning packets always assume 802.1Q vlan protocol, but
that is no longer the case since we now have support for Q-in-Q
on top of bonding.  Pass the vlan protocol to alb_send_lp_vid()
so that the packets are properly tagged.
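
To illustrate why the protocol has to be passed along, here is a minimal sketch of building a VLAN tag. The TPID values are the standard ones (0x8100 for 802.1Q, 0x88A8 for 802.1ad); `build_vlan_tag()` is a hypothetical helper and not the kernel's tagging code.

```c
#include <stdint.h>

#define TPID_8021Q  0x8100u  /* IEEE 802.1Q C-tag */
#define TPID_8021AD 0x88A8u  /* IEEE 802.1ad S-tag, the outer Q-in-Q tag */

/* Write a 4-byte VLAN tag (TPID then TCI, network byte order) into out. */
static void build_vlan_tag(uint8_t out[4], uint16_t tpid, uint16_t vid)
{
    uint16_t tci = vid & 0x0FFFu;   /* priority and DEI left at zero */

    out[0] = (uint8_t)(tpid >> 8);
    out[1] = (uint8_t)(tpid & 0xFF);
    out[2] = (uint8_t)(tci >> 8);
    out[3] = (uint8_t)(tci & 0xFF);
}
```

A sender that hard-codes `TPID_8021Q` would emit the wrong first two bytes for a VLAN configured with the 802.1ad protocol.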

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-3.15-20140521' of git://gitorious.org/linux-can/linux-can · a3431acf

Committed by David S. Miller on May 22, 2014

Marc Kleine-Budde says:

====================
pull-request: can 2014-05-21

this is a pull request for net/master, for the v3.15 release cycle, with a
single patch. Christopher R. Baker found a use after free during unloading of
the peak_pci driver. This is fixed in a patch by Stephane Grosjean.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: doc: Update references to skb->rxhash · b0db5cdf

Committed by Tobias Klauser on May 20, 2014

In commit 61b905da ("net: Rename skb->rxhash to skb->hash"), skb->rxhash
was renamed to skb->hash. Update references in Documentation
accordingly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: Remove unbalanced clk_disable call · 89df20d9

Committed by Hans de Goede on May 20, 2014

The stmmac_open call was calling clk_disable_unprepare on phy init
failure, but it never calls clk_prepare_enable. This causes
a WARN_ON in the clk framework to trigger if for some reason phy init
fails.
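
The imbalance can be modeled with a small reference-counted mock. All names here (`struct mock_clk`, `mock_clk_disable()`, `mock_open()`) are hypothetical stand-ins, not the clk framework or the stmmac driver.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted clock. */
struct mock_clk {
    int enable_count;
    bool warned;        /* set when disable is called on a disabled clock */
};

static void mock_clk_disable(struct mock_clk *clk)
{
    if (clk->enable_count == 0) {
        clk->warned = true;   /* models the WARN_ON in the clk framework */
        return;
    }
    clk->enable_count--;
}

/* Open path that never enabled the clock, so it must not disable it. */
static int mock_open(struct mock_clk *clk, bool phy_init_ok, bool buggy)
{
    if (!phy_init_ok) {
        if (buggy)
            mock_clk_disable(clk);  /* unbalanced: no matching enable */
        return -1;
    }
    return 0;
}
```

With `buggy` set, a failing phy init trips the warning flag. Without it, the error is returned and the clock count is left untouched.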
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 May, 2014 2 commits

ipv6: gro: fix CHECKSUM_COMPLETE support · 4de462ab

Committed by Eric Dumazet on May 19, 2014

When GRE support was added in linux-3.14, CHECKSUM_COMPLETE handling
broke on GRE+IPv6 because we did not update/use the appropriate csum :

GRO layer is supposed to use/update NAPI_GRO_CB(skb)->csum instead of
skb->csum

Tested using a GRE tunnel and IPv6 traffic. GRO aggregation now happens
at the first level (ethernet device) instead of being done in gre
tunnel. Native IPv6+TCP is still properly aggregated.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: fix an oops in tcindex filter · bf63ac73

Committed by Cong Wang on May 19, 2014

Kelly reported the following crash:

        IP: [<ffffffff817a993d>] tcf_action_exec+0x46/0x90
        PGD 3009067 PUD 300c067 PMD 11ff30067 PTE 800000011634b060
        Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
        CPU: 1 PID: 639 Comm: dhclient Not tainted 3.15.0-rc4+ #342
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        task: ffff8801169ecd00 ti: ffff8800d21b8000 task.ti: ffff8800d21b8000
        RIP: 0010:[<ffffffff817a993d>]  [<ffffffff817a993d>] tcf_action_exec+0x46/0x90
        RSP: 0018:ffff8800d21b9b90  EFLAGS: 00010283
        RAX: 00000000ffffffff RBX: ffff88011634b8e8 RCX: ffff8800cf7133d8
        RDX: ffff88011634b900 RSI: ffff8800cf7133e0 RDI: ffff8800d210f840
        RBP: ffff8800d21b9bb0 R08: ffffffff8287bf60 R09: 0000000000000001
        R10: ffff8800d2b22b24 R11: 0000000000000001 R12: ffff8800d210f840
        R13: ffff8800d21b9c50 R14: ffff8800cf7133e0 R15: ffff8800cad433d8
        FS:  00007f49723e1840(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffff88011634b8f0 CR3: 00000000ce469000 CR4: 00000000000006e0
        Stack:
         ffff8800d2170188 ffff8800d210f840 ffff8800d2171b90 0000000000000000
         ffff8800d21b9be8 ffffffff817c55bb ffff8800d21b9c50 ffff8800d2171b90
         ffff8800d210f840 ffff8800d21b0300 ffff8800d21b9c50 ffff8800d21b9c18
        Call Trace:
         [<ffffffff817c55bb>] tcindex_classify+0x88/0x9b
         [<ffffffff817a7f7d>] tc_classify_compat+0x3e/0x7b
         [<ffffffff817a7fdf>] tc_classify+0x25/0x9f
         [<ffffffff817b0e68>] htb_enqueue+0x55/0x27a
         [<ffffffff817b6c2e>] dsmark_enqueue+0x165/0x1a4
         [<ffffffff81775642>] __dev_queue_xmit+0x35e/0x536
         [<ffffffff8177582a>] dev_queue_xmit+0x10/0x12
         [<ffffffff818f8ecd>] packet_sendmsg+0xb26/0xb9a
         [<ffffffff810b1507>] ? __lock_acquire+0x3ae/0xdf3
         [<ffffffff8175cf08>] __sock_sendmsg_nosec+0x25/0x27
         [<ffffffff8175d916>] sock_aio_write+0xd0/0xe7
         [<ffffffff8117d6b8>] do_sync_write+0x59/0x78
         [<ffffffff8117d84d>] vfs_write+0xb5/0x10a
         [<ffffffff8117d96a>] SyS_write+0x49/0x7f
         [<ffffffff8198e212>] system_call_fastpath+0x16/0x1b

This is because we memcpy struct tcindex_filter_result, which contains
struct tcf_exts; its embedded struct list_head cannot simply be copied.
This is a regression introduced by commit 33be6271
(net_sched: act: use standard struct list_head).

It's not very easy to fix it as the code is a mess:

       if (old_r)
               memcpy(&cr, r, sizeof(cr));
       else {
               memset(&cr, 0, sizeof(cr));
               tcf_exts_init(&cr.exts, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE);
       }
       ...
       tcf_exts_change(tp, &cr.exts, &e);
       ...
       memcpy(r, &cr, sizeof(cr));

The above code should be equivalent to:

        tcindex_filter_result_init(&cr);
        if (old_r)
               cr.res = r->res;
        ...
        if (old_r)
               tcf_exts_change(tp, &r->exts, &e);
        else
               tcf_exts_change(tp, &cr.exts, &e);
        ...
        r->res = cr.res;

After this change, there is no need to copy struct tcf_exts.

It also fixes other places that zero structs containing struct tcf_exts.
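
Why a byte-wise copy of a list head is unsafe can be shown with a minimal userspace model. The list type below mirrors the shape of the kernel's `struct list_head`, while `struct exts_like` and the helper names are hypothetical.

```c
#include <string.h>

/* Minimal circular doubly linked list head, modeled on the kernel's. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* An empty list is one whose head points back at itself. */
static int list_is_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Hypothetical struct that embeds a list head. */
struct exts_like {
    struct list_head actions;
};

/* Returns 1 if a byte-wise copy of an empty list still looks empty. */
static int copy_preserves_empty(void)
{
    struct exts_like orig, copy;

    init_list_head(&orig.actions);
    memcpy(&copy, &orig, sizeof(copy));
    /* copy.actions.next still points at orig.actions, not at itself. */
    return list_is_empty(&copy.actions);
}
```

The copied head no longer satisfies the "points at itself" invariant, so any traversal starting from the copy walks into the original object's memory.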

Fixes: commit 33be6271 (net_sched: act: use standard struct list_head)
Reported-by: Kelly Anderson <kelly@xilka.com>
Tested-by: Kelly Anderson <kelly@xilka.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 May, 2014 4 commits

can: peak_pci: prevent use after free at netdev removal · 0b5a958c

Committed by Stephane Grosjean on May 20, 2014

As remarked by Christopher R. Baker in his post at

http://marc.info/?l=linux-can&m=139707295706465&w=2

there's a possibility for a use-after-free condition at device removal.

This simplified patch introduces an additional variable to prevent the issue.
Thanks for catching this.

Cc: linux-stable <stable@vger.kernel.org>
Reported-by: Christopher R. Baker <cbaker@rec.ri.cmu.edu>
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

ip_tunnel: Initialize the fallback device properly · 78ff4be4

Committed by Steffen Klassert on May 19, 2014

We need to initialize the fallback device to have a correct mtu
set on this device. Otherwise the mtu is set to null and the device
is unusable.

Fixes: fd58156e ("IPIP: Use ip-tunneling code.")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-3.15-20140519' of git://gitorious.org/linux-can/linux-can · d8d33c3b

Committed by David S. Miller on May 21, 2014

Marc Kleine-Budde says:

====================
pull-request: can 2014-05-19

this is a pull request for net/master, for the v3.15 release cycle,
with a single patch.

Oliver Hartkopp's patch removes a Kconfig option in the c_can driver,
which was added as a workaround during the v3.15 development. With all
cleanup patches this workaround is not needed anymore.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · d050de60

Committed by David S. Miller on May 21, 2014

Pablo Neira Ayuso says:

====================
Netfilter/nftables fixes for net

The following patchset contains nftables fixes for your net tree, they
are:

1) Fix crash when using the goto action in a rule by making sure that
   we always fall back on the base chain. Otherwise, this may try to
   access the counter memory area of non-base chains, which does not
   exist.

2) Fix several aspects of the rule tracing that are currently broken:

   * Reset rule number counter after goto/jump action, otherwise the
     tracing reports a bogus rule number.
   * Fix tracing of the goto action.
   * Fix bogus rule number counter after goto.
   * Fix missing return trace after finishing the walk through the
     non-base chain.
   * Fix missing trace when matching non-terminal rule.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

20 May, 2014 1 commit

vlan: Fix build error wth vlan_get_encap_level() · e1618d46

Committed by Vlad Yasevich on May 20, 2014

The new function vlan_get_encap_level() uses vlan_dev_priv()
which is only conditionally available when VLAN support is
enabled.  Make vlan_get_encap_level() conditionally available
as well.
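
The general pattern of providing a stub when a feature is compiled out can be sketched as follows. `DEMO_VLAN_ENABLED`, `struct enc_dev` and the `demo_*` functions are hypothetical; the sketch shows the shape of the technique, not the contents of the actual header.

```c
/* Hypothetical switch; in the kernel this would be a Kconfig symbol. */
#ifndef DEMO_VLAN_ENABLED
#define DEMO_VLAN_ENABLED 0
#endif

struct enc_dev {
    int encap_level;
};

#if DEMO_VLAN_ENABLED
/* Helper that only exists when the feature is built in. */
static int demo_dev_priv_level(const struct enc_dev *dev)
{
    return dev->encap_level;
}

static int demo_get_encap_level(const struct enc_dev *dev)
{
    return demo_dev_priv_level(dev);
}
#else
/* Stub so callers still compile when the feature is disabled. */
static int demo_get_encap_level(const struct enc_dev *dev)
{
    (void)dev;
    return 0;
}
#endif
```

Callers use `demo_get_encap_level()` unconditionally, and the build no longer depends on a helper that may not exist.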

Fixes: 44a40855 ("bonding: Fix stacked device detection in arp monitoring")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 May, 2014 3 commits

can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option · 524369e2

Committed by Oliver Hartkopp on May 06, 2014

In 2b9aecdc ("can: c_can: Disable rx split as workaround") a new Kconfig
option was introduced as a workaround. The tests performed by Alexander Stein
confirmed this option to be obsolete with all the other cleanups and fixes
that had been discussed at that time:
http://marc.info/?l=linux-can&m=139746476821294&w=2

Both (author and tester) agreed to remove this Kconfig option again:
http://marc.info/?l=linux-can&m=139883820714229&w=2

As some more cleanups took place since then a simple revert is not possible.
This patch removes the entire option as it would behave when disabled.
Further beautifications can be done later.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

MAINTAINERS: Pravin Shelar is Open vSwitch maintainer. · 4f337ed5

Committed by Jesse Gross on May 16, 2014

Pravin will be maintaining Open vSwitch going forward.

CC: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Convert return 0 to return rc · 02948344

Committed by Joe Perches on May 15, 2014

These "return 0;" uses seem wrong as there are
rc variables where error return values are set
but unused.
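
The shape of the problem, reduced to a toy example. The function names and the error value are hypothetical and are not taken from bnx2x.

```c
/* Hypothetical step that fails with a negative error code. */
static int demo_step(int fail)
{
    return fail ? -5 : 0;
}

/* Buggy shape: rc is set but the function reports success anyway. */
static int demo_setup_buggy(int fail)
{
    int rc = demo_step(fail);

    (void)rc;
    return 0;
}

/* Fixed shape: propagate rc to the caller. */
static int demo_setup_fixed(int fail)
{
    int rc = demo_step(fail);

    return rc;
}
```

Only the fixed variant lets the caller see that the step failed.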
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 May, 2014 20 commits

Merge branch 'bond_stacked_vlans' · a8d0d841

Committed by David S. Miller on May 16, 2014

Vlad Yasevich says:

====================
Fixed stacked vlan usage on top of bonds

The bonding device driver now supports q-in-q on top of bonds.  There are
a few issues here though.

First, when arp monitoring is used, the bonding driver will not correctly
tag traffic if the source of the arp device was configured on top of
q-in-q.  It may also incorrectly pick the wrong vlan id if the ordering
of the upper devices isn't as expected (there is no guarantee on ordering).

Second, the alb/tlb may use what would be considered 'inner' vlans in
its learning announcements, as it simply announces all vlans configured
on top of the bond without regard for encapsulation/stacking.

This series fixes the above 2 issues.  This series also depends on the
functionality introduced in
	http://patchwork.ozlabs.org/patch/349766/

Since v1:
  - Changed how patch1 verifies the device path.  We no longer use the
    _all_upper version of the function.  We find the path and if it was
    found, then collect the vlan information.
  - Use the constant to define maximum vlan nest level support on top
    of bonding.  This can be changed if 2 is too low.
  - Include patch2 into the series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Fix alb mode to only use first level vlans. · f60c3704

Committed by Vlad Yasevich on May 16, 2014

ALB/TLB learning packets use all vlans configured on top
of the bond.  This ends up being incorrect if we have a stack
of vlans on top of the bond.  ALB/TLB should only use
first-level/outermost vlans in its announcements.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Fix stacked device detection in arp monitoring · 44a40855

Committed by Vlad Yasevich on May 16, 2014

Prior to commit fbd929f2
	bonding: support QinQ for bond arp interval

the arp monitoring code allowed for proper detection of devices
stacked on top of vlans.  Since the above commit, the
code can still detect a device stacked on top of single
vlan, but not a device stacked on top of Q-in-Q configuration.
The search will only set the inner vlan tag if the route
device is the vlan device.  However, this is not always the
case, as it is possible to extend the stacked configuration.

With this patch it is possible to provision devices on
top of a Q-in-Q vlan configuration that should be used as
a source of ARP monitoring information.

For example:
ip link add link bond0 vlan10 type vlan proto 802.1q id 10
ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
ip link add link vlan100 type macvlan

Note:  This patch limits the number of stacked VLANs to 2,
just like before.  The original, however, had another issue
in that if we had more than 2 levels of VLANs, we would end
up generating incorrectly tagged traffic.  This is no longer
possible.
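
A rough model of collecting VLAN tags along a device path with a fixed nesting limit is given below. The device model (`struct demo_netdev`, a single `lower` pointer) and `demo_collect_tags()` are simplifying assumptions and do not reflect how the bonding driver walks its upper devices.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_VLAN_NEST 2   /* assumed limit, matching the note above */

/* Hypothetical device model: each device sits on top of `lower`. */
struct demo_netdev {
    const struct demo_netdev *lower;
    bool is_vlan;
    unsigned short vlan_id;
};

/*
 * Walk from `dev` down to `bond`, collecting VLAN ids outermost-last.
 * Returns the number of tags found, or -1 if `bond` is not below `dev`
 * or the path holds more than DEMO_MAX_VLAN_NEST VLANs.
 */
static int demo_collect_tags(const struct demo_netdev *dev,
                             const struct demo_netdev *bond,
                             unsigned short tags[DEMO_MAX_VLAN_NEST])
{
    int n = 0;

    for (; dev != NULL && dev != bond; dev = dev->lower) {
        if (!dev->is_vlan)
            continue;
        if (n == DEMO_MAX_VLAN_NEST)
            return -1;
        tags[n++] = dev->vlan_id;
    }
    return dev == bond ? n : -1;
}
```

For the macvlan on vlan100 on vlan10 on bond0 layout from the example commands, this model yields two tags. Adding a third VLAN on top makes it refuse the path instead of producing a wrongly tagged frame.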

Fixes: fbd929f2 (bonding: support QinQ for bond arp interval)
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@redhat.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: Patric McHardy <kaber@trash.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'stacked_netdevice_locking' · 6bd64ac0

Committed by David S. Miller on May 16, 2014

Vlad Yasevich says:

====================
Fix lockdep issues with stacked devices

Recent commit dc8eaaa0
    vlan: Fix lockdep warning when vlan dev handle notification

attempted to solve lockdep issues with vlans where multiple
vlans were stacked.  However, the code does not work correctly
when the vlan stack is interspersed with other devices in between
the vlans.  Additionally, similar lockdep issues show up with other
devices.

This series provides a generic way to solve these issues for any
devices that can be stacked.  It also addresses the concern for
vlan and macvlan devices.  I am not sure whether it makes sense
to do so for other types like team, vxlan, and bond.

Thanks
-vlad

Since v2:
  - Remove rcu variants from patch1, since that function is called
    only under rtnl.
  - Fix whitespace problems reported by checkpatch

Since v1:
  - Fixed up a goofed-up rebase.
    * is_vlan_dev() should be bool and that change belongs in patch3.
    * patch4 should not have any vlan changes in it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: Fix lockdep warnings with stacked macvlan devices · c674ac30

Committed by Vlad Yasevich on May 16, 2014

Macvlan devices try to avoid stacking, but that's not always
successful or even desired.  As an example, the following
configuration is perfectly legal and valid:

eth0 <--- macvlan0 <---- vlan0.10 <--- macvlan1

However, this configuration produces the following lockdep
trace:
[  115.620418] ======================================================
[  115.620477] [ INFO: possible circular locking dependency detected ]
[  115.620516] 3.15.0-rc1+ #24 Not tainted
[  115.620540] -------------------------------------------------------
[  115.620577] ip/1704 is trying to acquire lock:
[  115.620604]  (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[  115.620686]
but task is already holding lock:
[  115.620723]  (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
[  115.620795]
which lock already depends on the new lock.

[  115.620853]
the existing dependency chain (in reverse order) is:
[  115.620894]
-> #1 (&macvlan_netdev_addr_lock_key){+.....}:
[  115.620935]        [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[  115.620974]        [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[  115.621019]        [<ffffffffa07296c3>] vlan_dev_set_rx_mode+0x53/0x110 [8021q]
[  115.621066]        [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[  115.621105]        [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[  115.621143]        [<ffffffff815da6be>] __dev_open+0xde/0x140
[  115.621174]        [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[  115.621174]        [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[  115.621174]        [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[  115.621174]        [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[  115.621174]        [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[  115.621174]        [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[  115.621174]        [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[  115.621174]        [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[  115.621174]        [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[  115.621174]        [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[  115.621174]        [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[  115.621174]        [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[  115.621174]        [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[  115.621174]        [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
[  115.621174]
-> #0 (&vlan_netdev_addr_lock_key/1){+.....}:
[  115.621174]        [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
[  115.621174]        [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[  115.621174]        [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[  115.621174]        [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[  115.621174]        [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
[  115.621174]        [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[  115.621174]        [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[  115.621174]        [<ffffffff815da6be>] __dev_open+0xde/0x140
[  115.621174]        [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[  115.621174]        [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[  115.621174]        [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[  115.621174]        [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[  115.621174]        [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[  115.621174]        [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[  115.621174]        [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[  115.621174]        [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[  115.621174]        [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[  115.621174]        [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[  115.621174]        [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[  115.621174]        [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[  115.621174]        [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[  115.621174]        [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
[  115.621174]
other info that might help us debug this:

[  115.621174]  Possible unsafe locking scenario:

[  115.621174]        CPU0                    CPU1
[  115.621174]        ----                    ----
[  115.621174]   lock(&macvlan_netdev_addr_lock_key);
[  115.621174]                                lock(&vlan_netdev_addr_lock_key/1);
[  115.621174]                                lock(&macvlan_netdev_addr_lock_key);
[  115.621174]   lock(&vlan_netdev_addr_lock_key/1);
[  115.621174]
 *** DEADLOCK ***

[  115.621174] 2 locks held by ip/1704:
[  115.621174]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff815e6dbb>] rtnetlink_rcv+0x1b/0x40
[  115.621174]  #1:  (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
[  115.621174]
stack backtrace:
[  115.621174] CPU: 3 PID: 1704 Comm: ip Not tainted 3.15.0-rc1+ #24
[  115.621174] Hardware name: Hewlett-Packard HP xw8400 Workstation/0A08h, BIOS 786D5 v02.38 10/25/2010
[  115.621174]  ffffffff82339ae0 ffff880465f79568 ffffffff816ee20c ffffffff82339ae0
[  115.621174]  ffff880465f795a8 ffffffff816e9e1b ffff880465f79600 ffff880465b019c8
[  115.621174]  0000000000000001 0000000000000002 ffff880465b019c8 ffff880465b01230
[  115.621174] Call Trace:
[  115.621174]  [<ffffffff816ee20c>] dump_stack+0x4d/0x66
[  115.621174]  [<ffffffff816e9e1b>] print_circular_bug+0x200/0x20e
[  115.621174]  [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
[  115.621174]  [<ffffffff810d3172>] ? trace_hardirqs_on_caller+0xb2/0x1d0
[  115.621174]  [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[  115.621174]  [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
[  115.621174]  [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[  115.621174]  [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
[  115.621174]  [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[  115.621174]  [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
[  115.621174]  [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[  115.621174]  [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[  115.621174]  [<ffffffff815da6be>] __dev_open+0xde/0x140
[  115.621174]  [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[  115.621174]  [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[  115.621174]  [<ffffffff811e1db1>] ? mem_cgroup_bad_page_check+0x21/0x30
[  115.621174]  [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[  115.621174]  [<ffffffff810d394c>] ? __lock_acquire+0x37c/0x1a60
[  115.621174]  [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[  115.621174]  [<ffffffff815ea169>] ? rtnl_newlink+0xe9/0x730
[  115.621174]  [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[  115.621174]  [<ffffffff810d329d>] ? trace_hardirqs_on+0xd/0x10
[  115.621174]  [<ffffffff815e6dbb>] ? rtnetlink_rcv+0x1b/0x40
[  115.621174]  [<ffffffff815e6de0>] ? rtnetlink_rcv+0x40/0x40
[  115.621174]  [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[  115.621174]  [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[  115.621174]  [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[  115.621174]  [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[  115.621174]  [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[  115.621174]  [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
[  115.621174]  [<ffffffff8119d4f8>] ? might_fault+0xa8/0xb0
[  115.621174]  [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
[  115.621174]  [<ffffffff815cb51e>] ? verify_iovec+0x5e/0xe0
[  115.621174]  [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[  115.621174]  [<ffffffff816faa0d>] ? __do_page_fault+0x11d/0x570
[  115.621174]  [<ffffffff810cfe9f>] ? up_read+0x1f/0x40
[  115.621174]  [<ffffffff816fab04>] ? __do_page_fault+0x214/0x570
[  115.621174]  [<ffffffff8120a10b>] ? mntput_no_expire+0x6b/0x1c0
[  115.621174]  [<ffffffff8120a0b7>] ? mntput_no_expire+0x17/0x1c0
[  115.621174]  [<ffffffff8120a284>] ? mntput+0x24/0x40
[  115.621174]  [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[  115.621174]  [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[  115.621174]  [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b

Fix this by correctly providing macvlan lockdep class.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: Fix lockdep warning with stacked vlan devices. · d38569ab

Committed by Vlad Yasevich on May 16, 2014

This reverts commit dc8eaaa0.
	vlan: Fix lockdep warning when vlan dev handle notification

Instead we use the new API to find the lock subclass of
our vlan device.  This way we can support configurations where
vlans are interspersed with other devices:
  bond -> vlan -> macvlan -> vlan
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Allow for more then a single subclass for netif_addr_lock · 25175ba5

Committed by Vlad Yasevich on May 16, 2014

Currently netif_addr_lock_nested assumes that there can be only
a single nesting level between 2 devices.  However, if we
have multiple devices of the same type stacked, this fails.
For example:
 eth0 <-- vlan0.10 <-- vlan0.10.20

A more complicated configuration may stack more than one type of
device in different order.
Ex:
  eth0 <-- vlan0.10 <-- macvlan0 <-- vlan1.10.20 <-- macvlan1

This patch adds an ndo_* function that allows each stackable
device to report its nesting level.  If the device doesn't
provide this function, a default subclass of 1 is used.
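
The optional-callback-with-default idea can be sketched like this. `struct stk_dev`, `struct stk_dev_ops` and the function names are hypothetical; only the default value of 1 comes from the description above.

```c
#include <stddef.h>

struct stk_dev;

/* Hypothetical ops table; the callback is optional. */
struct stk_dev_ops {
    int (*get_lock_subclass)(const struct stk_dev *dev);
};

struct stk_dev {
    const struct stk_dev_ops *ops;
    int nest_level;
};

/* Subclass to use when taking the address lock of a stacked device. */
static int stk_addr_lock_subclass(const struct stk_dev *dev)
{
    if (dev->ops != NULL && dev->ops->get_lock_subclass != NULL)
        return dev->ops->get_lock_subclass(dev);
    return 1;   /* default when the device does not report a level */
}

/* Example callback for a stackable device type. */
static int stk_report_level(const struct stk_dev *dev)
{
    return dev->nest_level;
}
```

Device types that never stack need no changes, while stackable ones install a callback so each level gets a distinct lock subclass.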
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Find the nesting level of a given device by type. · 4085ebe8

Committed by Vlad Yasevich on May 16, 2014

Multiple devices in the kernel can be stacked/nested and they
need to know their nesting level for the purposes of lockdep.
This patch provides a generic function that determines a nesting
level of a particular device by its type (ex: vlan, macvlan, etc).
We only care about nesting of the same type of devices.

For example:
  eth0 <- vlan0.10 <- macvlan0 <- vlan1.20

The nesting level of vlan1.20 would be 1, since there is another vlan
in the stack under it.
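
The example can be reproduced with a simplified model in which every device has at most one lower device. `struct demo_node` and `demo_nest_level()` are hypothetical and only count same-type devices below the given one. The generic kernel function is not reproduced here.

```c
#include <stddef.h>

enum demo_type { DEMO_ETH, DEMO_VLAN, DEMO_MACVLAN };

/* Hypothetical device model: `lower` is the device this one sits on. */
struct demo_node {
    const struct demo_node *lower;
    enum demo_type type;
};

/* Count devices of the same type found below `dev` in the stack. */
static int demo_nest_level(const struct demo_node *dev)
{
    const struct demo_node *it;
    int level = 0;

    for (it = dev->lower; it != NULL; it = it->lower)
        if (it->type == dev->type)
            level++;
    return level;
}
```

Building the eth0 <- vlan0.10 <- macvlan0 <- vlan1.20 chain and asking for the level of vlan1.20 gives 1 in this model, because macvlan0 is skipped and vlan0.10 is counted.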
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: gro: make sure skb->cb[] initial content has not to be zero · 29e98242

Committed by Eric Dumazet on May 16, 2014

Starting from linux-3.13, GRO attempts to build full size skbs.

Problem is the commit assumed one particular field in skb->cb[]
was clean, but it is not the case on some stacked devices.

Timo reported a crash in case traffic is decrypted before
reaching a GRE device.

Fix this by initializing NAPI_GRO_CB(skb)->last at the right place,
this also removes one conditional.

Thanks a lot to Timo for providing full reports and bisecting this.

Fixes: 8a29111c ("net: gro: allow to build full sized skb")
Bisected-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: ip_tunnels: disable cache for nbma gre tunnels · 22fb22ea

Committed by Timo Teräs on May 16, 2014

The connected check fails to check for ip_gre nbma mode tunnels
properly. ip_gre creates temporary tnl_params with daddr specified
to pass-in the actual target on per-packet basis from neighbor
layer. Detect these tunnels by inspecting the actual tunnel
configuration.

Minimal test case:
 ip route add 192.168.1.1/32 via 10.0.0.1
 ip route add 192.168.1.2/32 via 10.0.0.2
 ip tunnel add nbma0 mode gre key 1 tos c0
 ip addr add 172.17.0.0/16 dev nbma0
 ip link set nbma0 up
 ip neigh add 172.17.0.1 lladdr 192.168.1.1 dev nbma0
 ip neigh add 172.17.0.2 lladdr 192.168.1.2 dev nbma0
 ping 172.17.0.1
 ping 172.17.0.2

The second ping should be going to 192.168.1.2 via 10.0.0.2;
but cached gre tunnel level route is used and it's actually going
to 192.168.1.1 via 10.0.0.1.

The lladdr's need to go to separate dst for the bug to trigger.
Test case uses separate route entries, but this can also happen
when the route entry is same: if there is a nexthop exception or
the GRE tunnel is IPsec'ed in which case the dst points to xfrm
bundle unique to the gre lladdr.

Fixes: 7d442fab ("ipv4: Cache dst in tunnels")
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Cc: Tom Herbert <therbert@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/dsa/dsa.c: increment chip_index during of_node handling on dsa_of_probe() · d1c0b471

Committed by Fabian Godehardt on May 16, 2014

Adding more than one chip on device-tree currently causes the probing
routine to always use the first chip's data pointer.
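
The effect of the missing increment can be modeled as below. `struct demo_chip_data`, `demo_probe()` and the `buggy` flag are hypothetical; the real probe routine reads its values from device-tree nodes, which this sketch replaces with a plain array.

```c
#include <stddef.h>

#define DEMO_NR_CHIPS 3

struct demo_chip_data {
    int sw_addr;
};

/*
 * Fill one data slot per child node. `addrs` stands in for the values
 * read from each child; `buggy` reproduces the missing increment.
 */
static void demo_probe(struct demo_chip_data chips[DEMO_NR_CHIPS],
                       const int addrs[DEMO_NR_CHIPS], int buggy)
{
    int chip_index = 0;
    size_t child;

    for (child = 0; child < DEMO_NR_CHIPS; child++) {
        struct demo_chip_data *cd = &chips[chip_index];

        cd->sw_addr = addrs[child];
        if (!buggy)
            chip_index++;   /* advance to the next chip's slot */
    }
}
```

Without the increment every child overwrites slot 0, so only the last chip's values survive and the remaining slots stay empty.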
Signed-off-by: Fabian Godehardt <fg@emlix.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv6: make "ip -6 route get mark xyz" work. · 2e47b291

Committed by Lorenzo Colitti on May 15, 2014

Currently, "ip -6 route get mark xyz" ignores the mark passed in
by userspace. Make it honour the mark, just like IPv4 does.
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge · 2f67cc87

Committed by David S. Miller on May 16, 2014

Include changes:
- fix NULL dereference in batadv_orig_hardif_seq_print_text()
- fix reference counting imbalance when using fragmentation
- avoid access to orig_node objects after they have been free'd
- fix local TT check for outgoing arp requests in DAT

xen-netback: fix race between napi_complete() and interrupt handler · 0d08fceb

Committed by David Vrabel on May 16, 2014

When the NAPI budget was not all used, xenvif_poll() would call
napi_complete() /after/ enabling the interrupt.  This resulted in a
race between the napi_complete() and the napi_schedule() in the
interrupt handler.  The use of local_irq_save/restore() avoided the
race iff the handler is running on the same CPU but not if it was
running on a different CPU.

Fix this properly by calling napi_complete() before reenabling
interrupts (in the xenvif_napi_schedule_or_enable_irq() call).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · 202630b4

Committed by David S. Miller on May 16, 2014

John W. Linville says:

====================
pull request: wireless 2014-05-15

Please pull this batch of fixes for the 3.15 stream...

For the mac80211 bits, Johannes says:

"One fix is to get better VHT performance and the other fixes tracing
garbage or other potential issues with the interface name tracing."

And...

"This has a fix from Emmanuel for a problem I failed to fix - when
association is in progress then it needs to be cancelled while
suspending (I had fixed the same for authentication). Also included a
fix from myself for a userspace API problem that hit the iw tool and a
fix to the remain-on-channel framework."

For the iwlwifi bits, Emmanuel says:

"Alex fixes the scan by disabling the fragmented scan. David prevents
scan offload while associated, the firmware seems not to like it. I
fix a stupid bug I made in BT Coex, and fix a bad #ifdef clause in rate
scaling.  Along with that there is a fix for a NULL pointer exception
that can happen if we load the driver and our ISR gets called because
the interrupt line is shared. The fix has been tested by the reporter."

And...

"We have here a fix from David Spinadel that makes a previous fix more
complete, and an off-by-one issue fixed by Eliad in the same area.
I fix the monitor that broke on the way."

Beyond that...

Daniel Kim's one-liner fixes a brcmfmac regression caused by a typo
in an earlier commit.

Rajkumar Manoharan fixes an ath9k oops reported by David Herrmann.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

202630b4

af_rxrpc: Fix XDR length check in rxrpc key demarshalling. · fde0133b

Committed by Nathaniel W Filardo on May 15, 2014

There may be padding on the ticket contained in the key payload, so just ensure
that the claimed token length is large enough, rather than exactly the right
size.
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
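
The difference between an exact-size check and a "large enough" check can be sketched as follows. This is a Python illustration, not the kernel code: the function names are hypothetical, and the only assumption is the standard XDR rule that opaque data is padded to a multiple of 4 bytes.

```python
def xdr_padded(length: int) -> int:
    """Round a byte count up to the next multiple of 4, as XDR padding does."""
    return (length + 3) & ~3


def strict_check(remaining_token_len: int, ticket_len: int) -> bool:
    # Accepts only tokens whose remaining length equals the ticket length.
    return remaining_token_len == ticket_len


def relaxed_check(remaining_token_len: int, ticket_len: int) -> bool:
    # Accepts any token that is at least large enough to hold the ticket.
    return remaining_token_len >= ticket_len


ticket_len = 10                        # hypothetical ticket size, not 4-aligned
remaining = xdr_padded(ticket_len)     # token carries the padded ticket

print("strict :", strict_check(remaining, ticket_len))
print("relaxed:", relaxed_check(remaining, ticket_len))
```

A ticket whose length is already a multiple of 4 passes both checks, so the difference only shows up for padded tickets.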

fde0133b

net: phy: resume phydev when going to RESUMING · 6e14a5ee

Committed by Zhangfei Gao on May 15, 2014

With commit be9dad1f ("net: phy: suspend phydev when going
to HALTED"), an unused PHY device will be put in a low-power mode
using BMCR_PDOWN. Some Ethernet drivers might be calling phy_start()
and phy_stop() from ndo_open and ndo_close() respectively, while
calling phy_connect() and phy_disconnect() from probe and remove.
In such a case, the PHY will be powered down during the phy_stop()
call, but will fail to be powered up in phy_start().
This patch fixes this scenario.
Signed-off-by: Jiancheng Xue <xuejiancheng@huawei.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
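
The scenario can be modelled with a tiny state machine. The sketch below is a Python illustration under simplifying assumptions, not the phylib implementation: `ToyPhy`, its three states, and the `resume_on_resuming` switch are hypothetical, and `powered` stands in for the BMCR_PDOWN bit being clear.

```python
from enum import Enum, auto


class PhyState(Enum):
    RUNNING = auto()
    HALTED = auto()
    RESUMING = auto()


class ToyPhy:
    """Toy PHY whose state machine powers the device down when halted."""

    def __init__(self, resume_on_resuming: bool):
        self.state = PhyState.RUNNING
        self.powered = True
        # False models the behaviour before the fix, True the behaviour after.
        self.resume_on_resuming = resume_on_resuming

    def phy_stop(self):
        self.state = PhyState.HALTED

    def phy_start(self):
        if self.state is PhyState.HALTED:
            self.state = PhyState.RESUMING

    def state_machine(self):
        if self.state is PhyState.HALTED:
            self.powered = False          # low-power mode
        elif self.state is PhyState.RESUMING:
            if self.resume_on_resuming:
                self.powered = True       # power the PHY back up
            self.state = PhyState.RUNNING


def open_close_open(phy: ToyPhy) -> bool:
    """Simulate ndo_close followed by ndo_open; return whether the PHY has power."""
    phy.phy_stop()
    phy.state_machine()
    phy.phy_start()
    phy.state_machine()
    return phy.powered


print("before fix, powered:", open_close_open(ToyPhy(resume_on_resuming=False)))
print("after fix,  powered:", open_close_open(ToyPhy(resume_on_resuming=True)))
```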

6e14a5ee

Merge branch 'mlx4-net' · 0c2e3fa9

Committed by David S. Miller on May 16, 2014

Or Gerlitz says:

====================
mlx4: Fix VF MAC address change under RoCE usage

This short series provides proper handling for the case where a
VF netdevice changes its MAC address under a RoCE use case. The code
it deals with was introduced in 3.15-rc1.

Prior to this series the source MAC used for the VM RoCE CM
packets remains as before the MAC modification. Hence RoCE CM
packets sent by the VF will not carry the same source MAC
address as the non-CM packets.

Earlier 3.15-rc commit f24f790f "net/mlx4_core: Load the Eth driver
first" handled just one instance of the problem, but this one
provides a more generic and proper solution which covers all
cases of VF mac change.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

0c2e3fa9

IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes · 9433c188

Committed by Matan Barak on May 15, 2014

When we receive a netdev event indicating a netdev change and/or
a netdev address change, we must change the MAC index used by the
proxy QP1 (in the QP context), otherwise RoCE CM packets sent by the
VF will not carry the same source MAC address as the non-CM packets.

We use the UPDATE_QP command to perform this change.

In order to avoid modifying a QP context based on a netdev event
while the driver attempts to destroy this QP (e.g. when either the mlx4_ib
or ib_mad module is unloaded), we use mutex locking in both flows.

Since the relevant mlx4 proxy GSI QP is created indirectly by the
mad module when it creates its GSI QP, mlx4 didn't need to
keep track of that QP prior to this change.

Now that QP modifications to this QP are needed from within the
driver, we add a reference to it.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9433c188

net/mlx4_core: Add UPDATE_QP SRIOV wrapper support · ce8d9e0d

Committed by Matan Barak on May 15, 2014

This patch adds UPDATE_QP SRIOV wrapper support.

The mechanism is a general one, but currently only source MAC
index changes are allowed for VFs.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce8d9e0d

May 16, 2014 · 5 commits

bonding: fix out of range parameters for bond_intmax_tbl · 81c70806

Committed by Nikolay Aleksandrov on May 15, 2014

I forgot to add a NULL entry to the bond_intmax_tbl when I introduced
it with the conversion of arp_interval, so add it now.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>

Fixes: 7bdb04ed ("bonding: convert arp_interval to use the new option API")
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

81c70806

xen-netback: Fix grant ref resolution in RX path · 58375744

Committed by Zoltan Kiss on May 15, 2014

The original series for reintroducing grant mapping for netback had a patch [1]
to handle receiving of packets from another VIF. Grant copy on the receiving
side needs the grant ref of the page to set up the op.
The original patch assumed (wrongly) that the frags array hasn't changed. In
the case reported by Sander, the sending guest sent a packet where the linear
buffer and the first frag were under PKT_PROT_LEN (=128) bytes.
xenvif_tx_submit() then pulled up the linear area to 128 bytes, and ditched the
first frag. The receiving side had an off-by-one problem when gathering the grant
refs.
This patch fixes that by checking whether the actual frag's page pointer is the
same as the page in the original frag list. It can handle any kind of change to
the original frags array, like:
- removing granted frags from the array at any point
- adding local pages to the frags list anywhere
- reordering the frags
It is optimized for the most common case, when there is a 1:1 relation between the
frags and the list, and it also works optimally when frags are removed from the end or
the beginning.

[1]: 3e2234: xen-netback: Handle foreign mapped pages on the guest RX path
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
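
The matching strategy described in this commit (compare each frag's page against the original list, starting where the previous match left off) can be sketched like this. It is a Python illustration only: `resolve_grant_refs` and the tuple layout are hypothetical, pages are represented by plain strings, and a page that is not in the original list is treated as a local page with no grant ref.

```python
from typing import List, Optional, Tuple


def resolve_grant_refs(
    frag_pages: List[str],
    original: List[Tuple[str, int]],
) -> List[Optional[int]]:
    """Return the grant ref for each frag page, or None for local pages.

    `original` is the list of (page, grant_ref) pairs as first recorded.
    The search resumes after the previous match, so the common 1:1 case
    finds every page on the first comparison.
    """
    refs: List[Optional[int]] = []
    cursor = 0
    count = len(original)
    for page in frag_pages:
        found = None
        for step in range(count):
            idx = (cursor + step) % count   # wrap around to handle reordering
            if original[idx][0] == page:
                found = original[idx][1]
                cursor = (idx + 1) % count
                break
        refs.append(found)
    return refs


original = [("p0", 10), ("p1", 11), ("p2", 12)]

print(resolve_grant_refs(["p0", "p1", "p2"], original))     # unchanged frags
print(resolve_grant_refs(["p1", "p2"], original))           # first frag dropped
print(resolve_grant_refs(["p0", "local", "p2"], original))  # local page added
print(resolve_grant_refs(["p2", "p0"], original))           # reordered
```

Keeping a cursor rather than always searching from index 0 is what makes the 1:1 case and removal from either end cheap in this model.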

58375744

ipv6: update Destination Cache entries when gateway turn into host · be7a010d

Committed by Duan Jiong on May 15, 2014

RFC 4861 states in 7.2.5:

         The IsRouter flag in the cache entry MUST be set based on the
         Router flag in the received advertisement.  In those cases
         where the IsRouter flag changes from TRUE to FALSE as a result
         of this update, the node MUST remove that router from the
         Default Router List and update the Destination Cache entries
         for all destinations using that neighbor as a router as
         specified in Section 7.3.3.  This is needed to detect when a
         node that is used as a router stops forwarding packets due to
         being configured as a host.

Currently, when dealing with an NA message whose IsRouter flag changes from
TRUE to FALSE, the kernel only removes the router from the Default Router List,
and doesn't update the Destination Cache entries.

Now, in order to update those Destination Cache entries, I introduce the
function rt6_clean_tohost().
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
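
The RFC 4861 requirement quoted in this commit can be modelled with plain dictionaries and sets. The sketch below is a Python illustration, not how rt6_clean_tohost() is implemented: the function name, the container layout, and the choice to drop affected entries (so that next-hop determination would run again) are all assumptions made for the example.

```python
from typing import Dict, Set


def handle_neighbor_advert(
    neighbor: str,
    is_router_now: bool,
    neighbor_cache: Dict[str, bool],
    default_routers: Set[str],
    destination_cache: Dict[str, str],
) -> None:
    """Apply the IsRouter update for one received Neighbor Advertisement.

    neighbor_cache maps neighbor -> IsRouter flag; destination_cache maps
    destination -> next-hop neighbor (hypothetical layout for illustration).
    """
    was_router = neighbor_cache.get(neighbor, False)
    neighbor_cache[neighbor] = is_router_now

    if was_router and not is_router_now:
        # Step the kernel already performed before this commit.
        default_routers.discard(neighbor)
        # Step this commit adds: flush destinations routed via that neighbor.
        stale = [dst for dst, gw in destination_cache.items() if gw == neighbor]
        for dst in stale:
            del destination_cache[dst]


neighbor_cache = {"fe80::1": True, "fe80::2": True}
default_routers = {"fe80::1", "fe80::2"}
destination_cache = {"2001:db8::a": "fe80::1", "2001:db8::b": "fe80::2"}

handle_neighbor_advert("fe80::1", False, neighbor_cache,
                       default_routers, destination_cache)
print(default_routers)
print(destination_cache)
```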

be7a010d

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · f895f0cf

Committed by David S. Miller on May 15, 2014

Conflicts:
	net/ipv4/ip_vti.c

Steffen Klassert says:

====================
pull request (net): ipsec 2014-05-15

This pull request has a merge conflict in net/ipv4/ip_vti.c
between commit 8d89dcdf ("vti: don't allow to add the same
tunnel twice") and commit a3245236  ("vti4:Don't count header
length twice"). It can be solved like it is done in linux-next.

1) Fix an ipv6 xfrm output crash when a packet is rerouted
   by netfilter to not use IPsec.

2) vti4 counts some header lengths twice leading to an incorrect
   device mtu. Fix this by counting these headers only once.

3) We don't catch the case if an unsupported protocol is submitted
   to the xfrm protocol handlers, this can lead to NULL pointer
   dereferences. Fix this by adding the appropriate checks.

4) vti6 may unregister pernet ops twice on init errors.
   Fix this by removing one of the calls to do it only once.
   From Mathias Krause.

5) Set the vti tunnel mark before doing a lookup in the error
   handlers. Otherwise we don't find the correct xfrm state.
====================

The conflict in ip_vti.c was simple, 'net' had a commit
removing a line from vti_tunnel_init() and this tree
being merged had a commit adding a line to the same
location.
Signed-off-by: David S. Miller <davem@davemloft.net>

f895f0cf

net: phy: Don't call phy_resume if phy_init_hw failed · b394745d

Committed by Guenter Roeck on May 14, 2014

If the call to phy_init_hw fails in phy_attach_direct, phy_detach is called
to detach the phy device from its network device. If the attached driver is a
generic phy driver, this also detaches the driver. Subsequently phy_resume
is called, which assumes without checking that a driver is attached to the
device. This will result in a crash such as

Unable to handle kernel paging request for data at address 0xffffffffffffff90
Faulting instruction address: 0xc0000000003a0e18
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c0000000003a0e18] .phy_attach_direct+0x68/0x17c
LR [c0000000003a0e6c] .phy_attach_direct+0xbc/0x17c
Call Trace:
[c0000003fc0475d0] [c0000000003a0e6c] .phy_attach_direct+0xbc/0x17c (unreliable)
[c0000003fc047670] [c0000000003a0ff8] .phy_connect_direct+0x28/0x98
[c0000003fc047700] [c0000000003f0074] .of_phy_connect+0x4c/0xa4

Only call phy_resume if phy_init_hw was successful.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b394745d
