Commits · d2f30f5172603bacaf34f0fdb021c25ad1915b05 · openanolis / cloud-kernel

25 May, 2018 5 commits

mlx4_core: allocate ICM memory in page size chunks · 1383cb81

Authored by Qing Huang on May 23, 2018

When a system is under memory pressure (high usage with fragments),
the original 256KB ICM chunk allocations will likely trigger kernel
memory management to enter slow path doing memory compact/migration
ops in order to complete high order memory allocations.

When that happens, user processes calling uverb APIs may get stuck
for more than 120s easily even though there are a lot of free pages
in smaller chunks available in the system.

Syslog:
...
Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
oracle_205573_e:205573 blocked for more than 120 seconds.
...

With 4KB ICM chunk size on x86_64 arch, the above issue is fixed.

However in order to support smaller ICM chunk size, we need to fix
another issue in large size kcalloc allocations.

E.g.
Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
entry). So we need a 16MB allocation for a table->icm pointer array to
hold 2M pointers which can easily cause kcalloc to fail.

The solution is to use kvzalloc to replace kcalloc which will fall back
to vmalloc automatically if kmalloc fails.

Signed-off-by: Qing Huang <qing.huang@oracle.com>
Acked-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1383cb81
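The sizing argument in the mlx4_core commit above can be checked with a few lines of arithmetic. This is a minimal user-space sketch, not driver code: the macro and function names are made up for illustration, and 8-byte pointers are assumed (as on x86_64).

```c
#include <assert.h>
#include <stdint.h>

/* Sizes quoted in the commit message; the names are illustrative only. */
#define ICM_CHUNK_SIZE 4096u /* 4KB ICM chunk on x86_64 */
#define MTT_ENTRY_SIZE 8u    /* bytes per mtt entry */
#define POINTER_SIZE   8u    /* assumed 64-bit pointers */

/* Number of ICM chunks needed to hold 2^log_num_mtt mtt entries. */
static uint64_t icm_chunks_needed(unsigned int log_num_mtt)
{
    uint64_t per_chunk = ICM_CHUNK_SIZE / MTT_ENTRY_SIZE; /* 512 entries */
    uint64_t entries;

    assert(log_num_mtt < 64);
    entries = (uint64_t)1 << log_num_mtt;
    return (entries + per_chunk - 1) / per_chunk;
}

/* Bytes needed for an array holding one pointer per ICM chunk. */
static uint64_t icm_pointer_array_bytes(unsigned int log_num_mtt)
{
    return icm_chunks_needed(log_num_mtt) * POINTER_SIZE;
}
```

For log_num_mtt=30 this yields 2M chunks and a 16MB pointer array, which matches the figures in the commit message and shows why a physically contiguous allocation of that size is fragile.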

enic: set DMA mask to 47 bit · 322eaa06

Authored by Govindarajulu Varadarajan on May 23, 2018

In commit 624dbf55 ("driver/net: enic: Try DMA 64 first, then
failover to DMA") DMA mask was changed from 40 bits to 64 bits.
Hardware actually supports only 47 bits.

Fixes: 624dbf55 ("driver/net: enic: Try DMA 64 first, then failover to DMA")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

322eaa06
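To make the 47-bit limit in the enic commit above concrete, here is a small user-space sketch. It copies the kernel's DMA_BIT_MASK() idiom into a standalone file; the helper dma_addr_fits() is a hypothetical name used only for illustration and is not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* User-space copy of the kernel's DMA_BIT_MASK() idiom. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Returns 1 if addr is reachable by a device limited to `bits` address bits. */
static int dma_addr_fits(uint64_t addr, unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return (addr & ~DMA_BIT_MASK(bits)) == 0;
}
```

An address with bit 47 set passes a 64-bit mask check but is out of reach for hardware that only decodes 47 bits, which is the mismatch the commit corrects.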

ppp: remove the PPPIOCDETACH ioctl · af8d3c7c

Authored by Eric Biggers on May 23, 2018

The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file
before f_count has reached 0, which is fundamentally a bad idea.  It
does check 'f_count < 2', which excludes concurrent operations on the
file since they would only be possible with a shared fd table, in which
case each fdget() would take a file reference.  However, it fails to
account for the fact that even with 'f_count == 1' the file can still be
linked into epoll instances.  As reported by syzbot, this can trivially
be used to cause a use-after-free.

Yet, the only known user of PPPIOCDETACH is pppd versions older than
ppp-2.4.2, which was released almost 15 years ago (November 2003).
Also, PPPIOCDETACH apparently stopped working reliably at around the
same time, when the f_count check was added to the kernel, e.g. see
https://lkml.org/lkml/2002/12/31/83.  Also, the current 'f_count < 2'
check makes PPPIOCDETACH only work in single-threaded applications; it
always fails if called from a multithreaded application.

All pppd versions released in the last 15 years just close() the file
descriptor instead.

Therefore, instead of hacking around this bug by exporting epoll
internals to modules, and probably missing other related bugs, just
remove the PPPIOCDETACH ioctl and see if anyone actually notices.  Leave
a stub in place that prints a one-time warning and returns EINVAL.

Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com
Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

af8d3c7c

net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands · 1dcbc01f

Authored by Yossi Kuperman on Oct 17, 2017

Sandbox QP Commands are retired in the order they are sent. Outstanding
commands are stored in a linked-list in the order they appear. Once a
response is received and the callback gets called, we pull the first
element off the pending list, assuming they correspond.

Sending a message and adding it to the pending list is not done atomically,
hence there is an opportunity for a race between concurrent requests.

Bind both send and add under a critical section.

Fixes: bebb23e6 ("net/mlx5: Accel, Add IPSec acceleration interface")
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Adi Nissim <adin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

1dcbc01f

net/mlx5e: When RXFCS is set, add FCS data into checksum calculation · 902a5459

Authored by Eran Ben Elisha on May 01, 2018

When RXFCS feature is enabled, the HW does not strip the FCS data,
however it is not present in the checksum calculated by the HW.

Fix that by manually calculating the FCS checksum and adding it to the SKB
checksum field.

Add helper function to find the FCS data for all SKB forms (linear,
one fragment or more).

Fixes: 102722fc ("net/mlx5e: Add support for RXFCS feature flag")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

902a5459
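The idea in the mlx5e commit above, folding the checksum of the trailing FCS bytes into a checksum that covers only the payload, can be sketched in user space. This is an illustration of ones' complement checksum combination, not the driver's helper: the function names are hypothetical, and the byte swap for odd offsets is the standard property of the Internet checksum rather than something stated in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator into a 16-bit ones' complement sum. */
static uint16_t csum_fold16(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones' complement sum of a buffer read as big-endian 16-bit words. */
static uint16_t csum_partial16(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8; /* pad odd tail with zero */
    return csum_fold16(sum);
}

/*
 * Add the checksum of a block that starts `offset` bytes into the frame.
 * A block at an odd offset contributes its sum with the bytes swapped.
 */
static uint16_t csum_block_add16(uint16_t csum, uint16_t csum2, size_t offset)
{
    if (offset & 1)
        csum2 = (uint16_t)((csum2 << 8) | (csum2 >> 8));
    return csum_fold16((uint64_t)csum + csum2);
}
```

Summing the payload and the 4 FCS bytes separately and then combining them with csum_block_add16() gives the same result as summing the whole frame in one pass, for both even and odd payload lengths.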

24 May, 2018 10 commits

net/mlx4: Fix irq-unsafe spinlock usage · d546b67c

Authored by Jack Morgenstein on May 23, 2018

spin_lock/unlock was used instead of spin_un/lock_irq
in a procedure used in process space, on a spinlock
which can be grabbed in an interrupt.

This caused the stack trace below to be displayed (on kernel
4.17.0-rc1 compiled with Lock Debugging enabled):

[  154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[  154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G          I
[  154.675856] -----------------------------------------------------
[  154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[  154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core]
[  154.700927]
and this task is already holding:
[  154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib]
[  154.718028] which would create a new lock dependency:
[  154.723705]  (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.}
[  154.731922]
but this new dependency connects a SOFTIRQ-irq-safe lock:
[  154.740798]  (&(&cq->lock)->rlock){..-.}
[  154.740800]
... which became SOFTIRQ-irq-safe at:
[  154.752163]   _raw_spin_lock_irqsave+0x3e/0x50
[  154.757163]   mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib]
[  154.762554]   ipoib_tx_poll+0x4a/0xf0 [ib_ipoib]
...
to a SOFTIRQ-irq-unsafe lock:
[  154.815603]  (&(&qp_table->lock)->rlock){+.+.}
[  154.815604]
... which became SOFTIRQ-irq-unsafe at:
[  154.827718] ...
[  154.827720]   _raw_spin_lock+0x35/0x50
[  154.833912]   mlx4_qp_lookup+0x1e/0x50 [mlx4_core]
[  154.839302]   mlx4_flow_attach+0x3f/0x3d0 [mlx4_core]

Since mlx4_qp_lookup() is called only in process space, we can
simply replace the spin_un/lock calls with spin_un/lock_irq calls.

Fixes: 6dc06c08 ("net/mlx4: Fix the check in attaching steering rules")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d546b67c

net: phy: broadcom: Fix bcm_write_exp() · 79fb218d

Authored by Florian Fainelli on May 22, 2018

On newer PHYs, we need to select the expansion register to write with
setting bits [11:8] to 0xf. This was done correctly by bcm7xxx.c prior
to being migrated to generic code under bcm-phy-lib.c which
unfortunately used the older implementation from the BCM54xx days.

Fix this by creating an inline stub: bcm_write_exp_sel() which adds the
correct value (MII_BCM54XX_EXP_SEL_ER) and update both the Cygnus PHY
and BCM7xxx PHY drivers which require setting these bits.

broadcom.c is unchanged because some PHYs even use a different selector
method, so let them specify it directly (e.g: SerDes secondary selector).

Fixes: a1cba561 ("net: phy: Add Broadcom phy library for common interfaces")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

79fb218d
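The selector handling described in the bcm_write_exp() commit above amounts to OR-ing a fixed pattern into the expansion register number. The sketch below is a user-space illustration only: EXP_SEL_ER is a local constant derived from the commit text (bits [11:8] set to 0xf), and exp_sel_reg() is a hypothetical helper, not the bcm_write_exp_sel() function itself.

```c
#include <assert.h>
#include <stdint.h>

/* Selector pattern implied by the commit text: bits [11:8] = 0xf. */
#define EXP_SEL_ER 0x0f00u

/* Build the value that selects expansion register `reg` on newer PHYs. */
static uint16_t exp_sel_reg(uint16_t reg)
{
    assert((reg & EXP_SEL_ER) == 0); /* selector bits must not be in use */
    return (uint16_t)(reg | EXP_SEL_ER);
}
```

Keeping the selector in a wrapper lets callers that need a different selector method keep passing the full value themselves, which is the reason the commit gives for leaving broadcom.c unchanged.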

net: phy: broadcom: Fix auxiliary control register reads · 733a969a

Authored by Florian Fainelli on May 22, 2018

We are currently doing auxiliary control register reads with the shadow
register value 0b111 (0x7) which incidentally is also the selector value
that should be present in bits [2:0]. Fix this by using the appropriate
selector mask which is defined (MII_BCM54XX_AUXCTL_SHDWSEL_MASK).

This does not have a functional impact yet because we always access the
MII_BCM54XX_AUXCTL_SHDWSEL_MISC (0x7) register in the current code.
This might change at some point though.

Fixes: 5b4e2900 ("net: phy: broadcom: add bcm54xx_auxctl_read")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

733a969a

net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase message · 4f7f56b6

Authored by Colin Ian King on May 22, 2018

Trivial fix to spelling mistake in mlx4_dbg debug message and also
change the phrasing of the message so that it is more readable.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4f7f56b6

ibmvnic: Only do H_EOI for mobility events · 73f9d364

Authored by Nathan Fontenot on May 22, 2018

When enabling the sub-CRQ IRQ a previous update sent a H_EOI prior
to the enablement to clear any pending interrupts that may be present
across a partition migration. This fixed a firmware bug where a
migration could erroneously indicate that a H_EOI was pending.

The H_EOI should only be sent when enabling during a mobility
event though. Doing so at other times could be wrong and can produce
extra driver output when IRQs are enabled when doing TX completion.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

73f9d364

tuntap: correctly set SOCKWQ_ASYNC_NOSPACE · 2f3ab622

Authored by Jason Wang on May 22, 2018

When link is down, writes to the device might fail with
-EIO. Userspace needs an indication when the status is resolved.  As a
fix, tun_net_open() attempts to wake up writers - but that is only
effective if SOCKWQ_ASYNC_NOSPACE has been set in the past. This is
not the case for vhost_net, which only polls for EPOLLOUT after it meets
errors during sendmsg().

This patch fixes this by making sure SOCKWQ_ASYNC_NOSPACE is set when
socket is not writable or device is down to guarantee EPOLLOUT will be
raised in either tun_chr_poll() or tun_sock_write_space() after device
is up.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Eric Dumazet <edumazet@google.com>
Fixes: 1bd4978a ("tun: honor IFF_UP in tun_get_user()")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f3ab622

virtio-net: fix leaking page for gso packet during mergeable XDP · 3d62b2a0

Authored by Jason Wang on May 22, 2018

We need to drop refcnt to xdp_page if we see a gso packet. Otherwise
it will be leaked. Fix this by moving the check of gso packet above
the linearizing logic. While at it, remove useless comment as well.

Cc: John Fastabend <john.fastabend@gmail.com>
Fixes: 72979a6c ("virtio_net: xdp, add slowpath case for non contiguous buffers")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d62b2a0

virtio-net: correctly check num_buf during err path · 850e088d

Authored by Jason Wang on May 22, 2018

If we successfully linearize the packet, num_buf will be set to zero,
which may confuse the error handling path, which assumes num_buf is at
least 1. This can lead the code to try to pop the descriptor of the next
buffer. Fix this by checking num_buf against 1 before decreasing.

Fixes: 4941d472 ("virtio-net: do not reset during XDP set")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

850e088d
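The guard described in the num_buf commit above can be illustrated with a small counting loop. This is a user-space sketch under assumptions: buffers_to_drop() is a hypothetical function that only counts how many extra buffers an error path would pop, and it does not reproduce the driver's actual loop body.

```c
#include <assert.h>

/*
 * Count how many follow-up buffers the error path would pop.
 * num_buf includes the buffer already consumed, so only values
 * above 1 leave anything to drop.
 */
static int buffers_to_drop(int num_buf)
{
    int dropped = 0;

    assert(num_buf >= 0);
    while (num_buf-- > 1)
        dropped++; /* stands in for popping and freeing one buffer */
    return dropped;
}
```

Comparing against 1 before decrementing means a num_buf of zero drops nothing. A bare pre-decrement test would instead start from -1 and keep popping buffers that belong to later packets.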

virtio-net: correctly transmit XDP buff after linearizing · 5d458a13

Authored by Jason Wang on May 22, 2018

We should not go for the error path after successfully transmitting an
XDP buffer after linearizing, since the error path may try to pop and
drop the next packet and increase the drop counters. Fix this by simply
dropping the refcnt of the original page and going for the xmit path.

Fixes: 72979a6c ("virtio_net: xdp, add slowpath case for non contiguous buffers")
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d458a13

virtio-net: correctly redirect linearized packet · 6890418b

Authored by Jason Wang on May 22, 2018

After a linearized packet was redirected by XDP, we should not go for
the err path which will try to pop buffers for the next packet and
increase the drop counter. Fix this by just dropping the page refcnt
for the original page.

Fixes: 186b3c99 ("virtio-net: support XDP_REDIRECT")
Reported-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6890418b

23 May, 2018 4 commits

pcnet32: add an error handling path in pcnet32_probe_pci() · d7db3186

Authored by Bo Chen on May 21, 2018

Make sure to invoke pci_disable_device() when errors occur in
pcnet32_probe_pci().

Signed-off-by: Bo Chen <chenbo@pdx.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

d7db3186

qed: Fix mask for physical address in ILT entry · fdd13dd3

Authored by Shahed Shaikh on May 21, 2018

ILT entry requires 12 bit right shifted physical address.
Existing mask for ILT entry of physical address i.e.
ILT_ENTRY_PHY_ADDR_MASK is not sufficient to handle 64bit
address because upper 8 bits of 64 bit address were getting
masked which resulted in completer abort error on
PCIe bus due to invalid address.

Fix that mask to handle 64bit physical address.

Fixes: fe56b9e6 ("qed: Add module with basic common support")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fdd13dd3
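The effect of a too-narrow mask, as described in the qed commit above, can be reproduced in user space. The mask widths here are inferred rather than quoted: a 64-bit address shifted right by 12 needs 52 bits, and losing the upper 8 of those corresponds to a 44-bit mask. All names in this sketch are hypothetical and are not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define ILT_ADDR_SHIFT 12
/* Inferred widths: 44 bits drops the top 8 of the 52 shifted address bits. */
#define ILT_MASK_44 ((1ULL << 44) - 1)
#define ILT_MASK_52 ((1ULL << 52) - 1)

/* Address field as it would be stored in an ILT entry under `mask`. */
static uint64_t ilt_entry_addr(uint64_t phys, uint64_t mask)
{
    return (phys >> ILT_ADDR_SHIFT) & mask;
}

/* Returns 1 if the stored field still describes the original page address. */
static int ilt_addr_preserved(uint64_t phys, uint64_t mask)
{
    assert((phys & 0xfffu) == 0); /* page-aligned input expected */
    return (ilt_entry_addr(phys, mask) << ILT_ADDR_SHIFT) == phys;
}
```

Low addresses survive either mask, which would explain why the problem only appears once physical addresses use the upper bits.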

net: fec: Add a SPDX identifier · 1f508124

Authored by Fabio Estevam on May 20, 2018

Currently there is no license information in the header of
this file.

The MODULE_LICENSE field contains ("GPL"), which means
GNU General Public License v2 or later, so add a corresponding
SPDX license identifier.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f508124

net: fec: ptp: Switch to SPDX identifier · 9fcca5ef

Authored by Fabio Estevam on May 20, 2018

Adopt the SPDX license identifier headers to ease license compliance
management.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9fcca5ef

22 May, 2018 1 commit

mac80211_hwsim: Fix radio dump for radio idx 0 · fed48250

Authored by Andrew Zaborowski on May 22, 2018

Since 6335698e the radio with idx of 0
will not get dumped in HWSIM_CMD_GET_RADIO because of the last_idx
checks. Offset cb->args[0] by 1 similarly to what is done in nl80211.c.

Fixes: 6335698e ("mac80211_hwsim: add generation count for netlink dump operation")
Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

fed48250
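The offset-by-one trick from the mac80211_hwsim commit above can be shown with a resumable dump in user space. This is a sketch under assumptions: dump_radios() and its parameters are hypothetical, `cursor` plays the role of cb->args[0] (zero on the first call), and the radio indexes are assumed to be sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy up to `budget` radio indexes into `out`, resuming after the last
 * index dumped by the previous call. The cursor stores last_idx + 1 so
 * that its initial value of 0 means "nothing dumped yet".
 */
static size_t dump_radios(const int *idx, size_t n, long *cursor,
                          int *out, size_t budget)
{
    long last_idx;
    size_t dumped = 0, i;

    assert(cursor != NULL && *cursor >= 0);
    last_idx = *cursor - 1;
    for (i = 0; i < n && dumped < budget; i++) {
        if (idx[i] <= last_idx)
            continue; /* already reported by an earlier call */
        out[dumped++] = idx[i];
        last_idx = idx[i];
    }
    *cursor = last_idx + 1;
    return dumped;
}
```

If the cursor stored last_idx directly, its initial value of 0 would be indistinguishable from "radio 0 already dumped", and the first radio would be skipped.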

19 May, 2018 2 commits

cxgb4: fix offset in collecting TX rate limit info · d775f26b

Authored by Rahul Lakkireddy on May 18, 2018

Correct the indirect register offsets in collecting TX rate limit info
in UP CIM logs.

Also, T5 doesn't support these indirect register offsets, so remove
them from collection logic.

Fixes: be6e36d9 ("cxgb4: collect TX rate limit info in UP CIM logs")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d775f26b

sh_eth: Change platform check to CONFIG_ARCH_RENESAS · b16a960d

Authored by Geert Uytterhoeven on May 18, 2018

Since commit 9b5ba0df ("ARM: shmobile: Introduce ARCH_RENESAS"),
CONFIG_ARCH_RENESAS is a more appropriate platform check than the legacy
CONFIG_ARCH_SHMOBILE, hence use the former.

Renesas SuperH SH-Mobile SoCs are still covered by the CONFIG_CPU_SH4
check.

This will allow dropping ARCH_SHMOBILE on ARM and ARM64 in the near
future.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

b16a960d

18 May, 2018 3 commits

ibmvnic: Fix statistics buffers memory leak · 07184213

Authored by Thomas Falcon on May 16, 2018

Move initialization of statistics buffers from ibmvnic_init function
into ibmvnic_probe. In the current state, ibmvnic_init will be called
again during a device reset, resulting in the allocation of new
buffers without freeing the old ones.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07184213

ibmvnic: Fix non-fatal firmware error reset · 134bbe7f

Authored by Thomas Falcon on May 16, 2018

It is not necessary to disable interrupt lines here during a reset
to handle a non-fatal firmware error. Move that call within the code
block that handles the other cases that do require interrupts to be
disabled and re-enabled.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

134bbe7f

ibmvnic: Free coherent DMA memory if FW map failed · 4cf2ddf3

Authored by Thomas Falcon on May 16, 2018

If the firmware map fails for whatever reason, remember to free
up the memory after.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4cf2ddf3

17 May, 2018 9 commits

tuntap: fix use after free during release · 7063efd3

Authored by Jason Wang on May 16, 2018

After commit b196d88a ("tun: fix use after free for ptr_ring") we
need to clean up the tx ring during release(). But unfortunately, it tries
to do the cleanup blindly after the socket was destroyed, which leads to
another use-after-free. Fix this by doing the cleanup before dropping
the last reference of the socket in __tun_detach().

Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
Fixes: b196d88a ("tun: fix use after free for ptr_ring")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7063efd3

qed: Fix LL2 race during connection terminate · 490068de

Authored by Michal Kalderon on May 16, 2018

Stress on qedi/qedr load/unload leads to list_del corruption.
This is due to ll2 connection terminate freeing resources without
verifying that no more ll2 processing will occur.

This patch unregisters the ll2 status block before terminating
the connection to assure this race does not occur.

Fixes: 1d6cff4f ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

490068de

qed: Fix possibility of list corruption during rmmod flows · ffd2c0d1

Authored by Michal Kalderon on May 16, 2018

The ll2 flows of flushing the txq/rxq need to be synchronized with the
regular fp processing. The lack of synchronization caused list corruption
during load/unload stress tests.

Fixes: 0a7fb11c ("qed: Add Light L2 support")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ffd2c0d1

qed: LL2 flush isles when connection is closed · f9bcd602

Authored by Michal Kalderon on May 16, 2018

Driver should free all pending isles once it gets a FLUSH cqe from FW.
Part of iSCSI out of order flow.

Fixes: 1d6cff4f ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f9bcd602

net: 8390: ne: Fix accidentally removed RBTX4927 support · e49ac967

Authored by Geert Uytterhoeven on May 16, 2018

The configuration settings for RBTX4927 were accidentally removed,
leading to a silently broken network interface.

Re-add the missing settings to fix this.

Fixes: 8eb97ff5 ("net: 8390: remove m32r specific bits")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e49ac967

net: dsa: bcm_sf2: Fix IPv6 rule half deletion · 1942adf6

Authored by Florian Fainelli on May 15, 2018

It was possible to delete only one half of an IPv6 rule, which would leave
the second half still programmed and possibly in use. Instead of
checking for the unused bitmap, we need to check the unique bitmap, and
refuse any deletion that does not match that criteria. We also need to
move that check from bcm_sf2_cfp_rule_del_one() into its caller,
bcm_sf2_cfp_rule_del(); otherwise we would no longer be able to delete
second halves, since they would not pass the first test.

Fixes: ba0696c2 ("net: dsa: bcm_sf2: Add support for IPv6 CFP rules")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1942adf6

net: dsa: bcm_sf2: Fix IPv6 rules and chain ID · 6c05561c

Authored by Florian Fainelli on May 15, 2018

We had several issues that would make the programming of IPv6 rules both
inconsistent and error prone:

- the chain ID that we would be asking the hardware to put in the
  packet's Broadcom tag would be off by one, it would return one of the
  two indexes, but not the one user-space specified

- when a user specified a particular location to insert a CFP rule at,
  we would not be returning the same index, which would be confusing if
  nothing else

- finally, like IPv4, it would be possible to overflow the last entry by
  re-programming it

Fix this by swapping the usage of rule_index[0] and rule_index[1] where
relevant in order to return a consistent and correct user-space
experience.

Fixes: ba0696c2 ("net: dsa: bcm_sf2: Add support for IPv6 CFP rules")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c05561c

net: dsa: bcm_sf2: Fix RX_CLS_LOC_ANY overwrite for last rule · 43a5e00f

Authored by Florian Fainelli on May 15, 2018

When we let the kernel pick up a rule location with RX_CLS_LOC_ANY, we
would be able to overwrite the last rules because of a number of issues.

The IPv4 code path would not be checking that rule_index is within
bounds, and it would also only be allowed to pick up rules from range
0..126 instead of the full 0..127 range. This would lead us to allow
overwriting the last rule when we let the kernel pick-up the location.

Fixes: 33061458 ("net: dsa: bcm_sf2: Move IPv4 CFP processing to specific functions")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

43a5e00f
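The bounds problem described in the RX_CLS_LOC_ANY commit above is easy to see in a standalone slot allocator. This is a user-space sketch, not the driver's bitmap code: pick_rule_slot() and the byte-per-slot `used` array are hypothetical, and only the table size of 128 (range 0..127) comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NUM_RULES 128 /* valid rule locations are 0..127 */

/*
 * Return the first free slot at or after `start`, or -ENOSPC when the
 * table is full. used[i] != 0 means slot i is occupied.
 */
static int pick_rule_slot(const uint8_t used[NUM_RULES], int start)
{
    int i;

    assert(start >= 0 && start <= NUM_RULES);
    for (i = start; i < NUM_RULES; i++)
        if (!used[i])
            return i;
    return -ENOSPC; /* never hand back an out-of-range or occupied slot */
}
```

Searching the full 0..127 range and returning an error when nothing is free avoids both failure modes from the commit message: skipping the last slot, and silently overwriting it when the table is full.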

net: phy: micrel: add 125MHz reference clock workaround · e1b505a6

Authored by Markus Niebel on May 15, 2018

The micrel KSZ9031 phy has an optional clock pin (CLK125_NDO) which can be
used as reference clock for the MAC unit. The clock signal must meet the
RGMII requirements to ensure the correct data transmission between the
MAC and the PHY. The KSZ9031 phy does not fulfill the duty cycle
requirement if the phy is configured as slave. For a complete
description look at the errata sheets: DS80000691D or DS80000692D.

The errata sheet recommends to force the phy into master mode whenever
there is a 1000Base-T link-up as work around. Only set the
"micrel,force-master" property if you use the phy reference clock provided
by CLK125_NDO pin as MAC reference clock in your application.

Attention, this workaround is only usable if the link partner can
be configured to slave mode for 1000Base-T.

Signed-off-by: Markus Niebel <Markus.Niebel@tqs.de>
[m.felsch@pengutronix.de: fix dt-binding documentation]
[m.felsch@pengutronix.de: use already existing result var for read/write]
[m.felsch@pengutronix.de: add error handling]
[m.felsch@pengutronix.de: add more comments]
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1b505a6

16 May, 2018 1 commit

ipvlan: call netdevice notifier when master mac address changed · ab452c3c

Authored by Keefe Liu on May 14, 2018

When the master device's mac has been changed, the commit
32c10bbf ("ipvlan: always use the current L2 addr of the
master") makes the IPVlan devices' mac change as well, but it
doesn't do related work such as flushing the IPVlan devices'
arp table.

Signed-off-by: Keefe Liu <liuqifa@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ab452c3c

15 May, 2018 5 commits

vmxnet3: use DMA memory barriers where required · f3002c13

Authored by hpreg@vmware.com on May 14, 2018

The gen bits must be read first from (resp. written last to) DMA memory.
The proper way to enforce this on Linux is to call dma_rmb() (resp.
dma_wmb()).

Signed-off-by: Regis Duchesne <hpreg@vmware.com>
Acked-by: Ronak Doshi <doshir@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f3002c13

vmxnet3: set the DMA mask before the first DMA map operation · 61aeecea

Authored by hpreg@vmware.com on May 14, 2018

The DMA mask must be set before, not after, the first DMA map operation, or
the first DMA map operation could in theory fail on some systems.

Fixes: b0eb57cb ("VMXNET3: Add support for virtual IOMMU")
Signed-off-by: Regis Duchesne <hpreg@vmware.com>
Acked-by: Ronak Doshi <doshir@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

61aeecea

cxgb4: Correct ntuple mask validation for hash filters · 849a742c

Authored by Kumar Sanghvi on May 14, 2018

The earlier code, which did a bitwise AND with the field width bits, was
wrong. Instead, simplify the code to calculate ntuple_mask based on the
supplied fields and then compare it with the mask configured in hw, which
is the correct and simpler way to validate the ntuple mask.

Fixes: 3eb8b62d ("cxgb4: add support to create hash-filters via tc-flower offload")
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

849a742c

net/mlx4_core: Fix error handling in mlx4_init_port_info. · 57f6f99f

Authored by Tarick Bedeir on May 13, 2018

Avoid exiting the function with a lingering sysfs file (if the first
call to device_create_file() fails while the second succeeds), and avoid
calling devlink_port_unregister() twice.

In other words, either mlx4_init_port_info() succeeds and returns zero, or
it fails, returns non-zero, and requires no cleanup.

Fixes: 096335b3 ("mlx4_core: Allow dynamic MTU configuration for IB ports")
Signed-off-by: Tarick Bedeir <tarick@google.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

57f6f99f
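The contract stated in the mlx4_init_port_info() commit above (either succeed, or fail and leave nothing to clean up) is commonly implemented with goto-based unwinding. The sketch below is a generic user-space illustration of that pattern, not the driver's code: struct port_state, init_port() and the fail_at parameter are all hypothetical, with flags standing in for the registration and the two sysfs files.

```c
#include <assert.h>
#include <stddef.h>

/* Flags standing in for resources acquired during port init. */
struct port_state {
    int registered; /* port registration done */
    int file1;      /* first attribute file created */
    int file2;      /* second attribute file created */
};

/*
 * Acquire three resources in order. fail_at selects which step fails
 * (0 = none). On failure, every earlier step is undone exactly once.
 */
static int init_port(struct port_state *s, int fail_at)
{
    assert(s != NULL);

    if (fail_at == 1)
        return -1;
    s->registered = 1;

    if (fail_at == 2)
        goto err_unregister;
    s->file1 = 1;

    if (fail_at == 3)
        goto err_remove_file1;
    s->file2 = 1;
    return 0;

err_remove_file1:
    s->file1 = 0;
err_unregister:
    s->registered = 0;
    return -1;
}
```

Because each label undoes one step and falls through to the next, a failure at any point releases exactly what was acquired, and the caller never has to repeat a cleanup the function already performed.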

tun: fix use after free for ptr_ring · b196d88a

Authored by Jason Wang on May 11, 2018

We used to initialize ptr_ring during TUNSETIFF, because its
size depends on the tx_queue_len of the netdevice, and we tried to clean
it up when the socket was detached from the netdevice. A race was spotted
when trying to do uninit during a read, which leads to a use after free
for the pointer ring. Solve this by always initializing a zero size
ptr_ring in open() and doing the resizing during TUNSETIFF, and then we
can safely do cleanup during close(). With this, there's no need for the
workaround that was introduced by commit 4df0bfc7 ("tun: fix a memory leak
for tfile->tx_array").

Reported-by: syzbot+e8b902c3c3fadf0a9dba@syzkaller.appspotmail.com
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Fixes: 1576d986 ("tun: switch to use skb array for tx")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b196d88a

openanolis / cloud-kernel synced successfully more than 1 year ago
