Commits · bb25c3855a12cc58e33cd7ee9b69943790fe35f7 · openanolis / cloud-kernel

21 Dec, 2017 · 11 commits

tipc: remove joining group member from congested list · bb25c385

Committed by Jon Maloy on Dec 20, 2017

When we receive a JOIN message from a peer member, the message may
contain an advertised window value ADV_IDLE that permits removing the
member in question from the tipc_group::congested list. However, since
the removal has been made conditional on the advertised window *not*
being ADV_IDLE, we miss this case. As a result, a sender may sometimes
enter a state of permanent, false, broadcast congestion.

We fix this by unconditionally removing the member from the congested
list before calling tipc_member_update(), which might potentially sort
it into the list again.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: net: Adding config fragment CONFIG_NUMA=y · 1c8e77fb

Committed by Naresh Kamboju on Dec 20, 2017

The kernel config fragment CONFIG_NUMA=y is needed for reuseport_bpf_numa.

Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mlx5-fixes-2017-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 932f8c77

Committed by David S. Miller on Dec 20, 2017

Saeed Mahameed says:

===================
Mellanox, mlx5 fixes 2017-12-19

The following series includes some fixes for the mlx5 core and ethernet
driver.

Please pull and let me know if there is any problem.

This series doesn't introduce any conflict with the ongoing mlx5 for-next
submission.

For -stable:

kernels >= v4.7.y
    ("net/mlx5e: Fix possible deadlock of VXLAN lock")
    ("net/mlx5e: Add refcount to VXLAN structure")
    ("net/mlx5e: Prevent possible races in VXLAN control flow")
    ("net/mlx5e: Fix features check of IPv6 traffic")

kernels >= v4.9.y
    ("net/mlx5: Fix error flow in CREATE_QP command")
    ("net/mlx5: Fix rate limit packet pacing naming and struct")

kernels >= v4.13.y
    ("net/mlx5: FPGA, return -EINVAL if size is zero")

kernels >= v4.14.y
    ("Revert "mlx5: move affinity hints assignments to generic code"")

All above patches apply and compile with no issues on corresponding -stable.
===================
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'cls_bpf-fix-offload-state-tracking-with-block-callbacks' · a8fcefe8

Committed by David S. Miller on Dec 20, 2017

Jakub Kicinski says:

===================
cls_bpf: fix offload state tracking with block callbacks

After introduction of block callbacks classifiers can no longer track
offload state.  cls_bpf used to do that in an attempt to move common
code from drivers to the core.  Remove that functionality and fix
drivers.

The user-visible bug this is fixing is that trying to offload a second
filter would trigger a spurious DESTROY and in turn disable the already
installed one.
===================
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: bpf: keep track of the offloaded program · d3f89b98

Committed by Jakub Kicinski on Dec 19, 2017

After TC offloads were converted to callbacks we have no choice
but to keep track of the offloaded filter in the driver.

The check for nn->dp.bpf_offload_xdp was a stop-gap solution
to make sure a failed TC offload won't disable XDP; it is no longer
necessary.  nfp_net_bpf_offload() will return -EBUSY on
TC vs XDP conflicts.

Fixes: 3f7889c4 ("net: sched: cls_bpf: call block callbacks for offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cls_bpf: fix offload assumptions after callback conversion · 102740bd

Committed by Jakub Kicinski on Dec 19, 2017

cls_bpf used to take care of tracking what offload state a filter
is in, i.e. it would track if offload request succeeded or not.
This information would then be used to issue correct requests to
the driver, e.g. requests for statistics only on offloaded filters,
removing only filters which were offloaded, using add instead of
replace if previous filter was not added etc.

This tracking of offload state no longer functions with the new
callback infrastructure.  There could be multiple entities trying
to offload the same filter.

Throw out all the tracking and corresponding commands and simply
pass to the drivers both old and new bpf program.  Drivers will
have to deal with offload state tracking by themselves.

Fixes: 3f7889c4 ("net: sched: cls_bpf: call block callbacks for offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix double free and memory corruption in get_net_ns_by_id() · 21b59443

Committed by Eric W. Biederman on Dec 19, 2017

(I can trivially verify that that idr_remove in cleanup_net happens
 after the network namespace count has dropped to zero --EWB)

Function get_net_ns_by_id() does not check for net::count
after it has found a peer in netns_ids idr.

It may dereference a peer after its count has already been
finally decremented. This leads to double free and memory
corruption:

put_net(peer)                                   rtnl_lock()
atomic_dec_and_test(&peer->count) [count=0]     ...
__put_net(peer)                                 get_net_ns_by_id(net, id)
  spin_lock(&cleanup_list_lock)
  list_add(&net->cleanup_list, &cleanup_list)
  spin_unlock(&cleanup_list_lock)
queue_work()                                      peer = idr_find(&net->netns_ids, id)
  |                                               get_net(peer) [count=1]
  |                                               ...
  |                                               (use after final put)
  v                                               ...
  cleanup_net()                                   ...
    spin_lock(&cleanup_list_lock)                 ...
    list_replace_init(&cleanup_list, ..)          ...
    spin_unlock(&cleanup_list_lock)               ...
    ...                                           ...
    ...                                           put_net(peer)
    ...                                             atomic_dec_and_test(&peer->count) [count=0]
    ...                                               spin_lock(&cleanup_list_lock)
    ...                                               list_add(&net->cleanup_list, &cleanup_list)
    ...                                               spin_unlock(&cleanup_list_lock)
    ...                                             queue_work()
    ...                                           rtnl_unlock()
    rtnl_lock()                                   ...
    for_each_net(tmp) {                           ...
      id = __peernet2id(tmp, peer)                ...
      spin_lock_irq(&tmp->nsid_lock)              ...
      idr_remove(&tmp->netns_ids, id)             ...
      ...                                         ...
      net_drop_ns()                               ...
        net_free(peer)                            ...
    }                                             ...
  |
  v
  cleanup_net()
    ...
    (Second free of peer)

Also, put_net() on the right CPU may be reordered with the left CPU's
list_replace_init(&cleanup_list, ..), and then cleanup_list
will be corrupted.

Since cleanup_net() is executed in a worker thread, while
put_net(peer) can happen anywhere, there should be
enough time for a concurrent get_net_ns_by_id() to pick
the peer up, so the race does not seem unlikely.
The patch fixes the problem in the standard way.

(Also, there is a possible problem in peernet2id_alloc(), which requires
a check for net::count under nsid_lock and maybe_get_net(peer), but
in the current stable kernel it is used under rtnl_lock() and so has to be
safe. Openswitch has begun to use peernet2id_alloc(), and possibly it should
be fixed too. This is not in the stable kernel yet, so I'll send
a separate message to netdev@ later.)

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Fixes: 0c7aecd4 "netns: add rtnl cmd to add and get peer netns ids"
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
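The "standard way" referred to in the message above is to take a reference only if the count has not already dropped to zero, which is the role the message assigns to maybe_get_net(). The following is a minimal userspace Python model of that idea, not kernel code. The names Net, get_unless_zero and lookup_by_id are hypothetical, a plain dict stands in for the idr, and a lock stands in for the atomic counter.

```python
import threading

class Net:
    """Toy stand-in for struct net with a reference count (hypothetical)."""
    def __init__(self):
        self.count = 1
        self._lock = threading.Lock()  # stands in for an atomic counter

    def put(self):
        """Drop one reference; return True if this was the final put."""
        with self._lock:
            self.count -= 1
            return self.count == 0

    def get_unless_zero(self):
        """Take a reference only if the object is still alive."""
        with self._lock:
            if self.count == 0:
                return False
            self.count += 1
            return True

def lookup_by_id(table, nsid):
    """Return a referenced peer, or None if it is missing or already dying."""
    peer = table.get(nsid)
    if peer is not None and not peer.get_unless_zero():
        return None
    return peer

table = {7: Net()}
live = lookup_by_id(table, 7)   # count goes 1 -> 2
live.put()                      # back to 1
table[7].put()                  # final put, count reaches 0
dead = lookup_by_id(table, 7)   # entry is still in the table, but is not revived
```

With an unconditional increment in place of get_unless_zero(), the last lookup would hand out an object whose cleanup has already been queued, which is the window the diagram above describes.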

Merge branch 'mvneta-fixes' · eda9873e

Committed by David S. Miller on Dec 20, 2017

Gregory CLEMENT says:

====================
Few mvneta fixes

Here is a small series of fixes for the mvneta driver. They
have already been used in the vendor kernel and are now ported to
mainline.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mvneta: eliminate wrong call to handle rx descriptor error · 2eecb2e0

Committed by Yelena Krivosheev on Dec 19, 2017

There are a few reasons why a received packet may be dropped in the
mvneta_rx_swbm() function. mvneta_rx_error() should be called only if
error bit [16] is set in the rx descriptor.

[gregory.clement@free-electrons.com: add fixes tag]
Cc: stable@vger.kernel.org
Fixes: dc35a10f ("net: mvneta: bm: add support for hardware buffer management")
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Tested-by: Dmitri Epshtein <dima@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mvneta: use proper rxq_number in loop on rx queues · ca5902a6

Committed by Yelena Krivosheev on Dec 19, 2017

When adding the RX queue association with each CPU, a typo was made in
the mvneta_cleanup_rxqs() function. This patch fixes it.

[gregory.clement@free-electrons.com: add commit log and fixes tag]
Cc: stable@vger.kernel.org
Fixes: 2dcf75e2 ("net: mvneta: Associate RX queues with each CPU")
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Tested-by: Dmitri Epshtein <dima@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mvneta: clear interface link status on port disable · 4423c18e

Committed by Yelena Krivosheev on Dec 19, 2017

When the port connects to the PHY in polling mode (with a poll interval
of 1 sec), the port and PHY link status must be synchronized in order
not to lose a link change event.

[gregory.clement@free-electrons.com: add fixes tag]
Cc: <stable@vger.kernel.org>
Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Tested-by: Dmitri Epshtein <dima@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Dec, 2017 · 26 commits

net/mlx5: Stay in polling mode when command EQ destroy fails · a2fba188

Committed by Moshe Shemesh on Dec 04, 2017

During unload, in mlx5_stop_eqs we move the command interface from events
mode to polling mode, but if destroying the command interface EQ fails we
move back to events mode.
That's wrong, since even if we fail to destroy the command interface EQ, we
do release its irq, so no interrupts will be received.

Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Cleanup IRQs in case of unload failure · d6b2785c

Committed by Moshe Shemesh on Nov 21, 2017

When mlx5_stop_eqs fails to destroy any of the EQs, it returns with an error.
In such a failure flow the function returns without releasing all EQ IRQs,
and then pci_free_irq_vectors fails.
Fix by only warning on an EQ destroy failure and continuing to release the
other EQs and their IRQs.

It fixes the following kernel trace:
kernel: kernel BUG at drivers/pci/msi.c:352!
...
...
kernel: Call Trace:
kernel: pci_disable_msix+0xd3/0x100
kernel: pci_free_irq_vectors+0xe/0x20
kernel: mlx5_load_one.isra.17+0x9f5/0xec0 [mlx5_core]

Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Fix steering memory leak · 139ed6c6

Committed by Maor Gottlieb on Dec 05, 2017

Flow steering priority and namespace are software only objects that
didn't have the proper destructors and were not freed during steering
cleanup.

Fix it by adding destructor functions for these objects.

Fixes: bd71b08e ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Prevent possible races in VXLAN control flow · 0c1cc8b2

Committed by Gal Pressman on Dec 04, 2017

When calling add/remove VXLAN port, a lock must be held in order to
prevent race scenarios when more than one add/remove happens at the
same time.
Fix by holding our state_lock (mutex) as done by all other parts of the
driver.
Note that the spinlock protecting the radix-tree is still needed in
order to synchronize radix-tree access from softirq context.

Fixes: b3f63c3d ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Add refcount to VXLAN structure · 23f4cc2c

Committed by Gal Pressman on Dec 03, 2017

A refcount mechanism must be implemented in order to prevent unwanted
scenarios such as:
- Open an IPv4 VXLAN interface
- Open an IPv6 VXLAN interface (different socket)
- Remove one of the interfaces

With current implementation, the UDP port will be removed from our VXLAN
database and turn off the offloads for the other interface, which is
still active.
The reference count mechanism will only allow UDP port removals once all
consumers are gone.

Fixes: b3f63c3d ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
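The scenario described in the message above can be illustrated with a small userspace Python model, offered only as a sketch. VxlanPortDb and its methods are hypothetical names and are not the driver's API. UDP port 4789 is used purely as an example value.

```python
class VxlanPortDb:
    """Toy model of a table of offloaded VXLAN UDP ports (hypothetical)."""
    def __init__(self):
        self._refs = {}  # UDP port -> number of users

    def add_port(self, port):
        """Return True only when the port is newly offloaded."""
        self._refs[port] = self._refs.get(port, 0) + 1
        return self._refs[port] == 1

    def del_port(self, port):
        """Return True only when the last user is gone and the offload is removed."""
        if port not in self._refs:
            return False
        self._refs[port] -= 1
        if self._refs[port] == 0:
            del self._refs[port]
            return True
        return False

    def is_offloaded(self, port):
        return port in self._refs

db = VxlanPortDb()
first_add = db.add_port(4789)    # IPv4 VXLAN interface
second_add = db.add_port(4789)   # IPv6 VXLAN interface, same UDP port
first_del = db.del_port(4789)    # one interface removed
still_offloaded = db.is_offloaded(4789)
second_del = db.del_port(4789)   # last interface removed
```

Without the counter, the first del_port() call would drop the port while the second interface still relies on it.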

net/mlx5e: Fix possible deadlock of VXLAN lock · 63235141

Committed by Gal Pressman on Nov 23, 2017

mlx5e_vxlan_lookup_port is called both from mlx5e_add_vxlan_port (user
context) and mlx5e_features_check (softirq), but the lock acquired does
not disable bottom half and might result in deadlock. Fix it by simply
replacing spin_lock() with spin_lock_bh().
While at it, replace all unnecessary spin_lock_irq() calls with spin_lock_bh().

lockdep's WARNING: inconsistent lock state
[  654.028136] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[  654.028229] swapper/5/0 [HC0[0]:SC1[9]:HE1:SE0] takes:
[  654.028321]  (&(&vxlan_db->lock)->rlock){+.?.}, at: [<ffffffffa06e7f0e>] mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
[  654.028528] {SOFTIRQ-ON-W} state was registered at:
[  654.028607]   _raw_spin_lock+0x3c/0x70
[  654.028689]   mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
[  654.028794]   mlx5e_vxlan_add_port+0x2e/0x120 [mlx5_core]
[  654.028878]   process_one_work+0x1e9/0x640
[  654.028942]   worker_thread+0x4a/0x3f0
[  654.029002]   kthread+0x141/0x180
[  654.029056]   ret_from_fork+0x24/0x30
[  654.029114] irq event stamp: 579088
[  654.029174] hardirqs last  enabled at (579088): [<ffffffff818f475a>] ip6_finish_output2+0x49a/0x8c0
[  654.029309] hardirqs last disabled at (579087): [<ffffffff818f470e>] ip6_finish_output2+0x44e/0x8c0
[  654.029446] softirqs last  enabled at (579030): [<ffffffff810b3b3d>] irq_enter+0x6d/0x80
[  654.029567] softirqs last disabled at (579031): [<ffffffff810b3c05>] irq_exit+0xb5/0xc0
[  654.029684] other info that might help us debug this:
[  654.029781]  Possible unsafe locking scenario:

[  654.029868]        CPU0
[  654.029908]        ----
[  654.029947]   lock(&(&vxlan_db->lock)->rlock);
[  654.030045]   <Interrupt>
[  654.030090]     lock(&(&vxlan_db->lock)->rlock);
[  654.030162]
 *** DEADLOCK ***

Fixes: b3f63c3d ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Fix error flow in CREATE_QP command · dbff26e4

Committed by Moni Shoua on Dec 04, 2017

In the error flow, when the DESTROY_QP command should be executed, the wrong
mailbox was set with data, not the one that is written to hardware.
Fix that.

Fixes: 09a7d9ec '{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc'
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Fix misspelling in the error message and comment · 777ec2b2

Committed by Eugenia Emantayev on Nov 16, 2017

Fix misspelling in word syndrome.

Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Fix defaulting RX ring size when not needed · 696a97cf

Committed by Eugenia Emantayev on Nov 14, 2017

Fix the bug where turning the CQE compression mechanism on or off
resets the RX ring size to the default value when this is not
needed.

Fixes: 2fc4bfb7 ("net/mlx5e: Dynamic RQ type infrastructure")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Fix features check of IPv6 traffic · 2989ad1e

Committed by Gal Pressman on Nov 21, 2017

The assumption that the next header field contains the transport
protocol is wrong for IPv6 packets with extension headers.
Instead, we should look for the inner-most next header field in the buffer.
This will fix TSO offload for tunnels over IPv6 with extension headers.

Performance testing: 19.25x improvement, cool!
Measuring bandwidth of 16 threads TCP traffic over IPv6 GRE tap.
CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
TSO: Enabled
Before: 4,926.24  Mbps
Now   : 94,827.91 Mbps

Fixes: b3f63c3d ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
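To illustrate why the Next Header field of the fixed IPv6 header is not enough, here is a userspace Python sketch that walks the extension header chain until it reaches a non-extension protocol. It is not the driver's code. The function names are hypothetical, and the sketch handles only the Hop-by-Hop, Routing, Destination Options and Fragment headers, ignoring AH and non-first fragments.

```python
import struct

IPPROTO_HOPOPTS, IPPROTO_ROUTING, IPPROTO_FRAGMENT, IPPROTO_DSTOPTS = 0, 43, 44, 60
IPPROTO_TCP, IPPROTO_GRE = 6, 47
IPV6_HDR_LEN = 40
EXT_HEADERS = (IPPROTO_HOPOPTS, IPPROTO_ROUTING, IPPROTO_DSTOPTS, IPPROTO_FRAGMENT)

def transport_proto(packet: bytes):
    """Walk the extension header chain and return (protocol, offset)."""
    next_hdr = packet[6]          # Next Header field of the fixed IPv6 header
    offset = IPV6_HDR_LEN
    while next_hdr in EXT_HEADERS:
        if offset + 8 > len(packet):
            raise ValueError("truncated extension header")
        hdr_next, hdr_ext_len = packet[offset], packet[offset + 1]
        if next_hdr == IPPROTO_FRAGMENT:
            hdr_len = 8                       # fragment header has a fixed size
        else:
            hdr_len = (hdr_ext_len + 1) * 8   # 8-octet units, not counting the first 8
        next_hdr = hdr_next
        offset += hdr_len
    return next_hdr, offset

def ipv6_header(next_hdr: int, payload_len: int) -> bytes:
    """Build a fixed IPv6 header with zeroed addresses (helper for the example)."""
    return struct.pack("!IHBB", 6 << 28, payload_len, next_hdr, 64) + bytes(32)

# Destination Options header (8 bytes, PadN padding) followed by a 4-byte GRE header.
dstopts = bytes([IPPROTO_GRE, 0, 1, 4, 0, 0, 0, 0])
pkt = ipv6_header(IPPROTO_DSTOPTS, len(dstopts) + 4) + dstopts + bytes(4)
naive = pkt[6]                    # what a check of the fixed header alone sees
proto, off = transport_proto(pkt)
```

In this example the fixed header reports Destination Options (60), while the walk finds GRE (47) at offset 48.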

net/mlx5e: Fix ETS BW check · ff089191

Committed by Huy Nguyen on Oct 26, 2017

Fix a bug that allows the ets bw sum to be 0% when an ets tc type exists.

Fixes: 08fb1dac ('net/mlx5e: Support DCBNL IEEE ETS')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Fix rate limit packet pacing naming and struct · 37e92a9d

Committed by Eran Ben Elisha on Nov 13, 2017

In mlx5_ifc, struct size was not complete, and thus driver was sending
garbage after the last defined field. Fixed it by adding reserved field
to complete the struct size.

In addition, rename all set_rate_limit to set_pp_rate_limit to be
compliant with the Firmware <-> Driver definition.

Fixes: 7486216b ("{net,IB}/mlx5: mlx5_ifc updates")
Fixes: 1466cc5b ("net/mlx5: Rate limit tables support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

Revert "mlx5: move affinity hints assignments to generic code" · 231243c8

Committed by Saeed Mahameed on Nov 10, 2017

Before the offending commit, mlx5 core did the IRQ affinity itself,
and it seems that the new generic code has some drawbacks. One of them
is that the user can no longer modify the IRQ affinity after the
initial affinity values have been assigned.

The issue is still being discussed and a solution in the new generic code
is required; until then we need to revert this patch.

This fixes the following issue:
echo <new affinity> > /proc/irq/<x>/smp_affinity
fails with  -EIO

This reverts commit a435393a.
Note: kept mlx5_get_vector_affinity in include/linux/mlx5/driver.h since
it is used in mlx5_ib driver.

Fixes: a435393a ("mlx5: move affinity hints assignments to generic code")
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jes Sorensen <jsorensen@fb.com>
Reported-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: FPGA, return -EINVAL if size is zero · bae115a2

Committed by Kamal Heib on Oct 29, 2017

Currently, if a size of zero is passed to
mlx5_fpga_mem_{read|write}_i2c()
the "err" return value will not be initialized, which triggers gcc
warnings:

[..]/mlx5/core/fpga/sdk.c:87 mlx5_fpga_mem_read_i2c() error:
uninitialized symbol 'err'.
[..]/mlx5/core/fpga/sdk.c:115 mlx5_fpga_mem_write_i2c() error:
uninitialized symbol 'err'.

Fix that.

Fixes: a9956d35 ('net/mlx5: FPGA, Add SBU infrastructure')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

ipv4: fib: Fix metrics match when deleting a route · d03a4557

Committed by Phil Sutter on Dec 19, 2017

The recently added fib_metrics_match() causes a regression for routes
with both RTAX_FEATURES and RTAX_CC_ALGO if the latter has
TCP_CONG_NEEDS_ECN flag set:

| # ip link add d0 type dummy
| # ip link set d0 up
| # ip route add 172.29.29.0/24 dev d0 features ecn congctl dctcp
| # ip route del 172.29.29.0/24 dev d0 features ecn congctl dctcp
| RTNETLINK answers: No such process

During route insertion, fib_convert_metrics() detects that the given CC
algo requires ECN and hence sets DST_FEATURE_ECN_CA bit in
RTAX_FEATURES.

During route deletion though, fib_metrics_match() compares stored
RTAX_FEATURES value with that from userspace (which obviously has no
knowledge about DST_FEATURE_ECN_CA) and fails.

Fixes: 5f9ae3d9 ("ipv4: do metrics match when looking up and deleting a route")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
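The mismatch described above can be modelled in a few lines of userspace Python. This is only a sketch of the comparison problem, not the kernel's fib_metrics_match(). The function names are hypothetical and the bit values are illustrative assumptions.

```python
RTAX_FEATURE_ECN = 1 << 0       # user-visible feature bit (illustrative value)
DST_FEATURE_ECN_CA = 1 << 31    # kernel-internal bit (illustrative value)

def convert_features(user_features, cc_needs_ecn):
    """Insertion: the stored value gains the internal bit if the CC algo needs ECN."""
    stored = user_features
    if cc_needs_ecn:
        stored |= DST_FEATURE_ECN_CA
    return stored

def features_match_buggy(stored, user_features):
    """Deletion before the fix: raw comparison of stored and user values."""
    return stored == user_features

def features_match_fixed(stored, user_features):
    """Deletion after the fix: ignore the internal bit userspace never sets."""
    return (stored & ~DST_FEATURE_ECN_CA) == user_features

# Models "features ecn congctl dctcp" from the reproducer above.
stored = convert_features(RTAX_FEATURE_ECN, cc_needs_ecn=True)
buggy = features_match_buggy(stored, RTAX_FEATURE_ECN)
fixed = features_match_fixed(stored, RTAX_FEATURE_ECN)
```

Masking the internal bit on the stored side keeps the userspace interface unchanged, since userspace cannot be expected to set a bit it does not know about.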

net: stmmac: Fix bad RX timestamp extraction · a1762456

Committed by Fredrik Hallenberg on Dec 18, 2017

As noted in dwmac4_wrback_get_rx_timestamp_status, the timestamp is found
in the context descriptor following the current descriptor. However, the
current code looks for the context descriptor in the current
descriptor, which will always fail.

Signed-off-by: Fredrik Hallenberg <megahallon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: Fix TX timestamp calculation · 200922c9

Committed by Fredrik Hallenberg on Dec 18, 2017

When using GMAC4, the value written to PTP_SSIR should be shifted. However,
the shifted value is also used in subsequent calculations, which results
in a bad timestamp value.

Signed-off-by: Fredrik Hallenberg <megahallon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix list sorting bug in function tipc_group_update_member() · 3db09601

Committed by Jon Maloy on Dec 18, 2017

When, during a join operation, or during message transmission, a group
member needs to be added to the group's 'congested' list, we sort it
into the list in ascending order, according to its current advertised
window size. However, we miss the case when the member is already on
that list. This will have the result that the member, after the window
size has been decremented, might be at the wrong position in that list.
This again may have the effect that we during broadcast and multicast
transmissions miss the fact that a destination is not yet ready for
reception, and we end up sending anyway. From this point on, the
behavior during the remaining session is unpredictable, e.g., with
underflowing window sizes.

We now correct this bug by unconditionally removing the member from
the list before (re-)sorting it in.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
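The re-sorting rule described above can be sketched in userspace Python. This is a simplified model and not the TIPC implementation: Member and update_member are hypothetical names, a Python list stands in for the kernel's linked list, and every member is placed on the list regardless of any congestion threshold.

```python
import bisect

class Member:
    """Toy group member with an advertised window (hypothetical)."""
    def __init__(self, name, window):
        self.name = name
        self.window = window

def update_member(congested, member, consumed):
    """Shrink the member's window and (re-)sort it into the congested list."""
    if member in congested:
        congested.remove(member)   # unconditional removal before re-sorting
    member.window -= consumed
    keys = [m.window for m in congested]
    # Insert in ascending order of advertised window.
    congested.insert(bisect.bisect_right(keys, member.window), member)

a, b, c = Member("a", 10), Member("b", 20), Member("c", 30)
congested = []
for m in (c, a, b):
    update_member(congested, m, 0)
update_member(congested, c, 25)    # c's window drops from 30 to 5
order = [m.name for m in congested]
```

If the removal step were skipped for a member already on the list, c would either stay at its old position or appear twice, and the ascending order the sender relies on would no longer hold.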

ip6_tunnel: get the min mtu properly in ip6_tnl_xmit · c9fefa08

Committed by Xin Long on Dec 18, 2017

Now it's using IPV6_MIN_MTU as the min mtu in ip6_tnl_xmit, but
IPV6_MIN_MTU actually only works when the inner packet is ipv6.

With IPV6_MIN_MTU for ipv4 packets, the new pmtu for inner dst
couldn't be set less than 1280. It would cause tx_err and the
packet to be dropped when the outer dst pmtu is close to 1280.

Jianlin found it by running ipv4 traffic with the topo:

  (client) gre6 <---> eth1 (route) eth2 <---> gre6 (server)

After changing eth2 mtu to 1300, the performance became very
low, or the connection was even broken. The issue also affects
ip4ip6 and ip6ip6 tunnels.

So if the inner packet is ipv4, 576 should be considered as the
min mtu.

Note that for ip4ip6 and ip6ip6 tunnels, the inner packet can
only be ipv4 or ipv6, but for gre6 tunnel, it may also be ARP.
This patch using 576 as the min mtu for non-ipv6 packet works
for all those cases.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
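The rule stated in the message, 1280 as the floor for IPv6 inner packets and 576 for everything else, can be written as a short userspace Python sketch. It is not the ip6_tnl_xmit() code. clamp_inner_mtu and the constant name IPV4_MIN_MTU are hypothetical, and 1256 is just an example candidate value below 1280.

```python
ETH_P_IP = 0x0800
ETH_P_ARP = 0x0806
ETH_P_IPV6 = 0x86DD
IPV6_MIN_MTU = 1280
IPV4_MIN_MTU = 576   # hypothetical constant name; the value is the one cited above

def clamp_inner_mtu(mtu, inner_proto):
    """Apply the lower bound for the inner path MTU, chosen by inner protocol."""
    floor = IPV6_MIN_MTU if inner_proto == ETH_P_IPV6 else IPV4_MIN_MTU
    return max(mtu, floor)

candidate = 1256     # example: outer pmtu minus tunnel overhead, below 1280
ipv4_mtu = clamp_inner_mtu(candidate, ETH_P_IP)
ipv6_mtu = clamp_inner_mtu(candidate, ETH_P_IPV6)
arp_mtu = clamp_inner_mtu(candidate, ETH_P_ARP)
```

Here the IPv4 and ARP cases keep 1256, while the IPv6 case is still raised to 1280, matching the "non-ipv6 packet" wording in the message.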

ip6_gre: remove the incorrect mtu limit for ipgre tap · 2c52129a

Committed by Xin Long on Dec 18, 2017

The same fix as the patch "ip_gre: remove the incorrect mtu limit for
ipgre tap" is also needed for ip6_gre.

Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_gre: remove the incorrect mtu limit for ipgre tap · cfddd4c3

Committed by Xin Long on Dec 18, 2017

ipgre tap driver calls ether_setup(), after commit 61e84623
("net: centralize net_device min/max MTU checking"), the range
of mtu is [min_mtu, max_mtu], which is [68, 1500] by default.

It causes the dev mtu of the ipgre tap device to not be greater
than 1500, this limit value is not correct for ipgre tap device.

Besides, its .change_mtu already does the right check. So this
patch just sets max_mtu to 0 and leaves the check to its
.change_mtu.

Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: update skb dst pmtu on tx path · a93bf0ff

Committed by Xin Long on Dec 18, 2017

Unlike ip tunnels, now vxlan doesn't do any pmtu update for
upper dst pmtu, even if it doesn't match the lower dst pmtu
any more.

The problem can be reproduced when reducing the vxlan lower
dev's pmtu when running netperf. In jianlin's testing, the
performance went to 1/7 of the previous.

This patch is to update the upper dst pmtu to match the lower
dst pmtu on tx path so that packets can be sent out even when
lower dev's pmtu has been changed.

It also works for metadata dst.

Note that this patch doesn't process any pmtu icmp packet.
But even in the future, support for processing pmtu icmp
packets of udp tunnels will also need this.

The same thing will be done for geneve in another patch.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: arc_emac: restart stalled EMAC · 78aa0975

Committed by Alexander Kochetkov on Dec 19, 2017

Under certain conditions the EMAC stops receiving incoming packets and
continuously increments the R_MISS register instead of saving data into
the provided buffer. This commit implements a workaround for that
situation: when a stall is detected, the EMAC is restarted.

On the device, the stall looks as if the device has lost its dynamic IP
address. ifconfig shows that the interface error counter increments rapidly.
At the same time, the DHCP server sees continuous DHCP requests
from the device.

In a real network, stalls happen very rarely. To make them frequent, a
broadcast storm[1] should be simulated. For the simulation it is necessary
to make the following connections:
    1. connect the radxarock to the 1st port of a switch
    2. connect some PC to the 2nd port of the switch
    3. connect two other free ports together using a standard ethernet cable,
       in order to make a switching loop.

After that, it is necessary to start a broadcast storm. For example, running
'ping' to some IP address on the PC triggers an ARP-request storm. After some
time (~10sec), the EMAC on the rk3188 will stall.

Observed and tested on rk3188 radxarock.

[1] https://en.wikipedia.org/wiki/Broadcast_radiation

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: arc_emac: fix arc_emac_rx() error paths · e688822d

Committed by Alexander Kochetkov on Dec 15, 2017

arc_emac_rx() has some issues found by code review.

If netdev_alloc_skb_ip_align() or dma_map_single() fails, the
rx fifo entry is not returned to the EMAC.

If dma_map_single() fails, the previously allocated skb is
lost to the driver. At the same time, the address of the newly
allocated skb is not provided to the EMAC.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mediatek: setup proper state for disabled GMAC on the default · 7352e252

Committed by Sean Wang on Dec 18, 2017

The current solution sets up a fixed, forced 1Gbps link on both GMACs
by default. However, a GMAC should always be put into the link-down
state when it is disabled on certain target boards. Otherwise,
the driver may receive unexpected data from the floating hardware
connection through the unused GMAC, although the driver already has
some protection in the RX path to keep such unexpected data from being
sent to the upper stack.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Remove batch neighbour deletion causing FW bug · 8ba6b30e

Committed by Petr Machata on Dec 17, 2017

This reverts commit 63dd00fa.

RAUHT DELETE_ALL seems to trigger a bug in FW. This manifests as later
calls to RAUHT ADD of an IPv6 neighbor failing with a "bad parameter"
error code.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Fixes: 63dd00fa ("mlxsw: spectrum_router: Add batch neighbour deletion")
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Dec, 2017 · 3 commits

tg3: Fix rx hang on MTU change with 5717/5719 · 748a240c

Committed by Brian King on Dec 15, 2017

This fixes a hang issue seen when changing the MTU size from 1500 MTU
to 9000 MTU on both 5717 and 5719 chips. In discussion with Broadcom,
they've indicated that these chipsets have the same phy as the 57766
chipset, so the same workarounds apply. This has been tested by IBM
on both Power 8 and Power 9 systems as well as by Broadcom on x86
hardware and has been confirmed to resolve the hang issue.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-for-davem-2017-12-19' of... · c6479d62

Committed by David S. Miller on Dec 19, 2017

Merge tag 'mac80211-for-davem-2017-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A few more fixes:
 * hwsim:
   - set To-DS bit in some frames missing it
   - fix sleeping in atomic
 * nl80211:
   - doc cleanup
   - fix locking in an error path
 * build:
   - don't append to created certs C files
   - ship certificate pre-hexdumped
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

cfg80211: ship certificates as hex files · 04a7279f

Committed by Johannes Berg on Dec 19, 2017

Not only does this remove the need for the hexdump code in most
normal kernel builds (still there for the extra directory), but
it also removes the need to ship binary files, which apparently
is somewhat problematic, as Randy reported.

While at it, also add the generated files to clean-files.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
