Commits · 5cd8d46ea1562be80063f53c7c6a5f40224de623 · openeuler / Kernel

24 Nov, 2018 · 1 commit

packet: copy user buffers before orphan or clone · 5cd8d46e

Committed by Willem de Bruijn on Nov 20, 2018

tpacket_snd sends packets with user pages linked into skb frags. It
notifies that pages can be reused when the skb is released by setting
skb->destructor to tpacket_destruct_skb.

This can cause data corruption if the skb is orphaned (e.g., on
transmit through veth) or cloned (e.g., on mirror to another psock).

Create a kernel-private copy of data in these cases, same as tun/tap
zerocopy transmission. Reuse that infrastructure: mark the skb as
SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx).

Unlike other zerocopy packets, do not set shinfo destructor_arg to
struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify
when the original skb is released and a timestamp is recorded. Do not
change this timestamp behavior. The ubuf_info->callback is not needed
anyway, as no zerocopy notification is expected.

Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The
resulting value is not a valid ubuf_info pointer, nor a valid
tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this.
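The low-bit marking described above can be modelled outside the kernel. The following is a minimal Python sketch, assuming addresses are at least 2-byte aligned so that bit 0 of a real pointer is always clear. The function names are hypothetical and are not the kernel's skb_zcopy_.._nouarg helpers.

```python
# Illustrative model only: tag an aligned address by setting its lowest bit.
NOUARG_TAG = 0x1  # assumption: real pointers are aligned, so bit 0 is free


def mark_nouarg(addr: int) -> int:
    """Return addr with the low bit set, marking it as not-a-uarg."""
    return addr | NOUARG_TAG


def is_nouarg(value: int) -> bool:
    """True if the stored value carries the not-a-uarg tag."""
    return bool(value & NOUARG_TAG)


def strip_tag(value: int) -> int:
    """Recover the original aligned address from a tagged value."""
    return value & ~NOUARG_TAG


frame = 0x7F001000            # hypothetical aligned frame address
tagged = mark_nouarg(frame)   # no longer a valid aligned pointer
```

Because the tagged value is odd, it cannot be confused with an aligned structure pointer or an aligned frame address, which is the property the commit message relies on.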

The fix relies on features introduced in commit 52267790 ("sock:
add MSG_ZEROCOPY"), so can be backported as is only to 4.14.

Tested with `./in_netns.sh ./txring_overwrite` from
http://github.com/wdebruij/kerneltools/tests

Fixes: 69e3c75f ("net: TX_RING and packet mmap")
Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Nov, 2018 · 5 commits

Merge branch 'ibmvnic-Fix-queue-and-buffer-accounting-errors' · 039e70a7

Committed by David S. Miller on Nov 22, 2018

Thomas Falcon says:

====================
ibmvnic: Fix queue and buffer accounting errors

This series includes two small fixes. The first resolves a typo bug
in the code to clean up unused RX buffers during device queue removal.
The second ensures that device queue memory is updated to reflect new
supported queue ring sizes after migration to other backing hardware.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Update driver queues after change in ring size support · 5bf032ef

Committed by Thomas Falcon on Nov 21, 2018

During device reset, queue memory is not being updated to accommodate
changes in ring buffer sizes supported by backing hardware. Track
any differences in ring buffer sizes following the reset and update
queue memory when possible.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Fix RX queue buffer cleanup · b7cdec3d

Committed by Thomas Falcon on Nov 21, 2018

The wrong index is used when cleaning up RX buffer objects during release
of RX queues. Update to use the correct index counter.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: set xdp_prog to NULL if bpf_prog_add fails · 6d0f60b0

Committed by Lorenzo Bianconi on Nov 21, 2018

Set xdp_prog pointer to NULL if bpf_prog_add fails since that routine
reports the error code instead of NULL in case of failure and xdp_prog
pointer value is used in the driver to verify if XDP is currently
enabled.
Moreover report the error code to userspace if nicvf_xdp_setup fails
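The failure mode described above (an error value that is not NULL being mistaken for "XDP enabled") can be sketched in Python. This is an illustrative model under stated assumptions: `ErrPtr`, `prog_add_stub` and `Nic` are hypothetical stand-ins, not the driver's code or the kernel's API.

```python
import errno


class ErrPtr:
    """Stand-in for an error pointer: not None, but not a usable program."""

    def __init__(self, err: int):
        self.err = err


def prog_add_stub(prog, fail: bool):
    """Hypothetical call that reports failure as an error value, not None."""
    return ErrPtr(-errno.ENOMEM) if fail else prog


class Nic:
    def __init__(self):
        self.xdp_prog = None  # None models NULL, meaning XDP is disabled

    def xdp_setup(self, prog, fail: bool = False) -> int:
        res = prog_add_stub(prog, fail)
        if isinstance(res, ErrPtr):
            self.xdp_prog = None   # do not keep the error value around
            return res.err         # report the error code to the caller
        self.xdp_prog = res
        return 0

    def xdp_enabled(self) -> bool:
        # The "is XDP on?" test only looks at whether a program is stored.
        return self.xdp_prog is not None
```

Without the reset to `None`, `xdp_enabled()` would report `True` after a failed setup, which mirrors the bug the commit message describes.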

Fixes: 05c773f5 ("net: thunderx: Add basic XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/dim: Update DIM start sample after each DIM iteration · 0211dda6

Committed by Tal Gilboa on Nov 21, 2018

On every iteration of net_dim, the algorithm may choose to
check the system state by comparing the current data sample
with the previous data sample. After each of these comparisons,
regardless of the action taken, the sample used as the baseline
needs to be updated.

This patch fixes a bug that causes DIM to take wrong decisions,
due to never updating the baseline sample for comparison between
iterations. This way, DIM always compares current sample with
zeros.
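The baseline problem can be illustrated with a small Python model. It is a simplified sketch, assuming a sample carries a single packet counter; the real net_dim samples and decision logic are richer, and the class and method names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    packets: int = 0


class Dim:
    def __init__(self):
        self.start_sample = Sample()   # baseline, initially all zeros

    def iterate_buggy(self, cur: Sample) -> int:
        # Baseline is never refreshed, so every delta is taken against zeros.
        return cur.packets - self.start_sample.packets

    def iterate_fixed(self, cur: Sample) -> int:
        delta = cur.packets - self.start_sample.packets
        self.start_sample = cur        # refresh baseline after each comparison
        return delta
```

With cumulative counters of 100 and then 150 packets, the fixed variant reports deltas of 100 and 50, while the buggy variant reports 100 and 150.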

Although this is a functional fix, it also improves and stabilizes
performance as the algorithm works properly now.

Performance:
Tested single UDP TX stream with pktgen:
samples/pktgen/pktgen_sample03_burst_single_flow.sh -i p4p2 -d 1.1.1.1
-m 24:8a:07:88:26:8b -f 3 -b 128

ConnectX-5 100GbE packet rate improved from 15-19Mpps to 19-20Mpps.
Also, toggling between profiles is less frequent with the fix.

Fixes: 8115b750 ("net/dim: use struct net_dim_sample as arg to net_dim")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Nov, 2018 · 10 commits

net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts · 426a593e

Committed by Vincent Chen on Nov 21, 2018

In the original ftmac100_interrupt(), the interrupts are only disabled when
the condition "netif_running(netdev)" is true. However, this condition
causes a kernel hang in the following case. When the user requests to
disable the network device, kernel will clear the bit __LINK_STATE_START
from the dev->state and then call the driver's ndo_stop function. Network
device interrupts are not blocked during this process. If an interrupt
occurs between clearing __LINK_STATE_START and stopping network device,
kernel cannot disable the interrupts due to the condition
"netif_running(netdev)" in the ISR. Hence, kernel will hang due to the
continuous interruption of the network device.

In order to solve the above problem, the interrupts of the network device
should always be disabled in the ISR without being restricted by the
condition "netif_running(netdev)".

[V2]
Remove unnecessary curly braces.
Signed-off-by: Vincent Chen <vincentc@andestech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'smc-fixes' · 395048eb

Committed by David S. Miller on Nov 21, 2018

Ursula Braun says:

====================
net/smc: fixes 2018-11-12

here is V4 of some net/smc fixes in different areas for the net tree.

v1->v2:
   do not define 8-byte alignment for union smcd_cdc_cursor in
   patch 4/5 "net/smc: atomic SMCD cursor handling"
v2->v3:
   stay with 8-byte alignment for union smcd_cdc_cursor in
   patch 4/5 "net/smc: atomic SMCD cursor handling", but get rid of
   __packed for struct smcd_cdc_msg
v3->v4:
   get rid of another __packed for struct smc_cdc_msg in
   patch 4/5 "net/smc: atomic SMCD cursor handling"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: use after free fix in smc_wr_tx_put_slot() · e438bae4

Committed by Ursula Braun on Nov 20, 2018

In smc_wr_tx_put_slot() field pend->idx is used after being
cleared. That means always idx 0 is cleared in the wr_tx_mask.
This results in a broken administration of available WR send
payload buffers.
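The ordering bug can be shown with a toy Python model. This is a sketch under assumptions: `Pend`, the `set` used as a slot mask, and both functions are hypothetical simplifications of the structures named in the commit message.

```python
class Pend:
    def __init__(self, idx: int):
        self.idx = idx

    def clear(self):
        self.idx = 0   # models zeroing the whole structure


def put_slot_buggy(mask: set, pend: Pend) -> None:
    pend.clear()
    mask.discard(pend.idx)   # idx was already reset, so slot 0 is released


def put_slot_fixed(mask: set, pend: Pend) -> None:
    idx = pend.idx           # save the index before clearing
    pend.clear()
    mask.discard(idx)        # release the slot that was actually in use
```

Starting from busy slots `{0, 3}` and releasing slot 3, the buggy variant leaves `{3}` (slot 0 wrongly freed), while the fixed variant leaves `{0}`.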
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: atomic SMCD cursor handling · b9a22dd9

Committed by Ursula Braun on Nov 20, 2018

Running uperf tests with SMCD on LPARs results in corrupted cursors.
SMCD cursors should be treated atomically to fix cursor corruption.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: add SMC-D shutdown signal · 0512f69e

Committed by Hans Wippel on Nov 20, 2018

When a SMC-D link group is freed, a shutdown signal should be sent to
the peer to indicate that the link group is invalid. This patch adds the
shutdown signal to the SMC code.
Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: use queue pair number when matching link group · ee05ff7a

Committed by Karsten Graul on Nov 20, 2018

When searching for an existing link group the queue pair number is also
to be taken into consideration. When the SMC server sends a new number
in a CLC packet (keeping all other values equal) then a new link group
is to be created on the SMC client side.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: abort CLC connection in smc_release · f07920ad

Committed by Hans Wippel on Nov 20, 2018

In case of a non-blocking SMC socket, the initial CLC handshake is
performed over a blocking TCP connection in a worker. If the SMC socket
is released, smc_release has to wait for the blocking CLC socket
operations (e.g., kernel_connect) inside the worker.

This patch aborts a CLC connection when the respective non-blocking SMC
socket is released to avoid waiting on socket operations or timeouts.
Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'wireless-drivers-for-davem-2018-11-20' of... · 1e2b1046

Committed by David S. Miller on Nov 21, 2018

Merge tag 'wireless-drivers-for-davem-2018-11-20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 4.20

First set of fixes for 4.20, this time we have quite a few of them but
all very small.

ath9k

* fix a locking regression found by a static checker

wlcore

* fix a crash which was a regression with wakeirq handling

brcm80211

* yet another fix for 160 MHz channel handling

mt76

* fix a longstanding build problem when CONFIG_LEDS_CLASS is disabled

* don't use uninitialised mutex

iwlwifi

* do note that the iwlwifi merge tag (commit 4ec321c1) seems to
  contain wrong list of changes so ignore that

* fix ACPI data handling, a memory leak and other smaller fixes

ath10k

* fix a crash during suspend which was a recent regression
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: defer SACK compression after DupThresh · 86de5921

Committed by Eric Dumazet on Nov 20, 2018

Jean-Louis reported a TCP regression and bisected to recent SACK
compression.

After a loss episode (receiver not able to keep up and dropping
packets because its backlog is full), linux TCP stack is sending
a single SACK (DUPACK).

Sender waits a full RTO timer before recovering losses.

While RFC 6675 says in section 5, "Algorithm Details",

   (2) If DupAcks < DupThresh but IsLost (HighACK + 1) returns true --
       indicating at least three segments have arrived above the current
       cumulative acknowledgment point, which is taken to indicate loss
       -- go to step (4).
...
   (4) Invoke fast retransmit and enter loss recovery as follows:

there are old TCP stacks not implementing this strategy, and
still counting the dupacks before starting fast retransmit.

While these stacks probably perform poorly when receivers implement
LRO/GRO, we should be a little more gentle to them.

This patch makes sure we do not enable SACK compression unless
3 dupacks have been sent since last rcv_nxt update.
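The gating rule can be sketched as a small Python state machine. This is an illustrative model, assuming a threshold of three as stated in the commit message; the class, its counter, and the return strings are hypothetical and do not mirror the kernel's field names.

```python
DUP_THRESH = 3  # assumption: three dupacks, as named in the commit message


class Receiver:
    def __init__(self):
        self.dupacks_since_rcv_nxt = 0

    def on_rcv_nxt_advance(self) -> None:
        # In-order data arrived; start counting dupacks again.
        self.dupacks_since_rcv_nxt = 0

    def on_out_of_order(self) -> str:
        """Return 'send' for an immediate DUPACK, 'compress' to defer it."""
        if self.dupacks_since_rcv_nxt < DUP_THRESH:
            self.dupacks_since_rcv_nxt += 1
            return "send"
        return "compress"
```

The first three out-of-order arrivals after each rcv_nxt update trigger immediate DUPACKs, so a sender that counts dupacks can still reach its fast retransmit threshold; only later ones are candidates for compression.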

Ideally we should even rearm the timer to send one or two
more DUPACK if no more packets are coming, but that will
be work aiming for linux-4.21.

Many thanks to Jean-Louis for bisecting the issue, providing
packet captures and testing this patch.

Fixes: 5d9f4262 ("tcp: add SACK compression")
Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Tested-by: Jean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: skb_scrub_packet(): Scrub offload_fwd_mark · b5dd186d

Committed by Petr Machata on Nov 20, 2018

When a packet is trapped and the corresponding SKB marked as
already-forwarded, it retains this marking even after it is forwarded
across veth links into another bridge. There, since it ingresses the
bridge over veth, which doesn't have offload_fwd_mark, it triggers a
warning in nbp_switchdev_frame_mark().

Then nbp_switchdev_allowed_egress() decides not to allow egress from
this bridge through another veth, because the SKB is already marked, and
the mark (of 0) of course matches. Thus the packet is incorrectly
blocked.

Solve by resetting offload_fwd_mark in skb_scrub_packet(). That
function is called from tunnels and also from veth, and thus catches the
cases where traffic is forwarded between bridges and transformed in a
way that invalidates the marking.

Fixes: 6bc506b4 ("bridge: switchdev: Add forward mark support for stacked devices")
Fixes: abf4bb6b ("skbuff: Add the offload_mr_fwd_mark field")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Suggested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Nov, 2018 · 4 commits

net/sched: act_police: fix race condition on state variables · f2cbd485

Committed by Davide Caratti on Nov 20, 2018

After 'police' configuration parameters were converted to use RCU instead
of spinlock, the state variables used to compute the traffic rate (namely
'tcfp_toks', 'tcfp_ptoks' and 'tcfp_t_c') are erroneously read/updated in
the traffic path without any protection.

Use a dedicated spinlock to avoid race conditions on these variables, and
ensure proper cache-line alignment. In this way, 'police' is still faster
than what we observed when 'tcf_lock' was used in the traffic path, i.e.
reverting commit 2d550dba ("net/sched: act_police: don't use spinlock
in the data path"). Moreover, we preserve the throughput improvement that
was obtained after 'police' started using per-cpu counters, when 'avrate'
is used instead of 'rate'.

Changes since v1 (thanks to Eric Dumazet):
- call ktime_get_ns() before acquiring the lock in the traffic path
- use a dedicated spinlock instead of tcf_lock
- improve cache-line usage

Fixes: 2d550dba ("net/sched: act_police: don't use spinlock in the data path")
Reported-and-suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>

MAINTAINERS: add myself as co-maintainer for r8169 · b1d98233

Committed by Heiner Kallweit on Nov 20, 2018

Meanwhile I know the driver quite well and I refactored bigger parts
of it. As a result people contact me already with r8169 questions.
Therefore I'd volunteer to become co-maintainer of the driver also
officially.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Fix SOF_TIMESTAMPING_RX_HARDWARE to use the latest timestamp during TCP coalescing · cadf9df2

Committed by Stephen Mallon on Nov 20, 2018

During tcp coalescing ensure that the skb hardware timestamp refers to the
highest sequence number data.
Previously only the software timestamp was updated during coalescing.
Signed-off-by: Stephen Mallon <stephen.mallon@sydney.edu.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tg3: Add PHY reset for 5717/5719/5720 in change ring and flow control paths · 59663e42

Committed by Siva Reddy Kallam on Nov 20, 2018

This patch avoids a PHY lockup with 5717/5719/5720 in the change ring
and flow control paths. It solves the RX hang seen while continuously
changing ring or flow control parameters under heavy traffic from the peer.
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Nov, 2018 · 20 commits

Merge tag 'mlx5-fixes-2018-11-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 1359f251

Committed by David S. Miller on Nov 19, 2018

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2018-11-19

The following fixes are for mlx5 core and netdev driver.

For -stable v4.16
bc7fda7d4637 ('net/mlx5e: IPoIB, Reset QP after channels are closed')

For -stable v4.17
36917a270395 ('net/mlx5: IPSec, Fix the SA context hash key')

For -stable v4.18
6492a432be3a ('net/mlx5e: Always use the match level enum when parsing TC rule match')
c3f81be236b1 ('net/mlx5e: Removed unnecessary warnings in FEC caps query')
c5ce2e736b64 ('net/mlx5e: Fix selftest for small MTUs')

For -stable v4.19
effcd896b25e ('net/mlx5e: Adjust to max number of channles when re-attaching')
394cbc5acd68 ('net/mlx5e: RX, verify received packet size in Linear Striding RQ')
447cbb3613c8 ('net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded')
c223c1574612 ('net/mlx5e: Claim TC hw offloads support only under a proper build config')

Please pull and let me know if there's any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ibmnvic: Fix deadlock problem in reset · a5681e20

Committed by Juliet Kim on Nov 19, 2018

This patch changes to use rtnl_lock only during a reset to avoid
deadlock that could occur when a thread operating close is holding
rtnl_lock and waiting for reset_lock acquired by another thread,
which is waiting for rtnl_lock in order to set the number of tx/rx
queues during a reset.

Also, we now set the number of tx/rx queues during a soft reset
for failover or LPM events.
Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qed-Fix-Queue-Manager-getters' · db9a0bae

Committed by David S. Miller on Nov 19, 2018

Denis Bolotin says:

====================
qed: Fix Queue Manager getters

This patch series fixes various queue manager getter functions. It is
important to make sure the getter's caller will receive a valid queue even
in error case to prevent more serious bugs.
Please consider applying to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Fix QM getters to always return a valid pq · eb62cca9

Committed by Denis Bolotin on Nov 19, 2018

The getters' callers don't know the valid Physical Queue (PQ) values.
This patch makes sure that a valid PQ will always be returned.

The patch consists of 3 fixes:

 - When qed_init_qm_get_idx_from_flags() receives a disabled flag, it
   returned PQ 0, which can potentially be another function's pq. Verify
   that flag is enabled, otherwise return default start_pq.

 - When qed_init_qm_get_idx_from_flags() receives an unknown flag, it
   returned NULL and could lead to a segmentation fault. Return default
   start_pq instead.

 - A modulo operation was added to MCOS/VFS PQ getters to make sure the
   PQ returned is in range of the required flag.
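The three fixes listed above share one idea: a getter should never hand back a queue outside the caller's own range. A minimal Python sketch follows, assuming PQs are plain integers and flags are strings; both functions and their parameters are hypothetical and are not the qed driver's signatures.

```python
def get_pq(enabled: dict, offsets: dict, start_pq: int, flag: str) -> int:
    """Return the PQ for flag, falling back to start_pq when it is unusable."""
    if flag not in offsets or not enabled.get(flag, False):
        # Unknown or disabled flag: return the default instead of 0 or nothing.
        return start_pq
    return start_pq + offsets[flag]


def get_indexed_pq(first_pq: int, count: int, idx: int) -> int:
    """Wrap idx into the count PQs reserved for a flag (count must be > 0)."""
    return first_pq + (idx % count)
```

For example, with `start_pq = 10` a disabled or unknown flag yields 10, and an out-of-range index 6 over 4 reserved PQs starting at 20 wraps to PQ 22.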

Fixes: b5a9ee7c ("qed: Revise QM cofiguration")
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Fix bitmap_weight() check · 276d43f0

Committed by Denis Bolotin on Nov 19, 2018

Fix the condition which verifies that only one flag is set. The API
bitmap_weight() should receive size in bits instead of bytes.
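The effect of passing a byte count where a bit count is expected can be seen in a short Python model. This is a sketch only: the `bitmap_weight` function below imitates the contract "count set bits among the lowest nbits bits", and the 4-byte flags word is an assumption made for illustration.

```python
BITS_PER_BYTE = 8


def bitmap_weight(bitmap: int, nbits: int) -> int:
    """Count set bits among the lowest nbits bits of bitmap."""
    return bin(bitmap & ((1 << nbits) - 1)).count("1")


flags = (1 << 2) | (1 << 9)   # two flags set
size_bytes = 4                # size of a hypothetical 32-bit flags word

# Passing bytes inspects only bits 0..3 and misses the flag at bit 9.
weight_wrong = bitmap_weight(flags, size_bytes)
# Passing bits inspects the whole word.
weight_right = bitmap_weight(flags, size_bytes * BITS_PER_BYTE)
```

Here the byte-sized call sees a single flag, so an "exactly one flag is set" check would wrongly pass even though two flags are set.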

Fixes: b5a9ee7c ("qed: Revise QM cofiguration")
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Fix failing ethtool query on FEC query error · 9184e51b

Committed by Shay Agroskin on Nov 08, 2018

If the FEC caps query fails when executing 'ethtool <interface>',
the whole callback fails unnecessarily. Fix that by replacing the
error return code with debug logging only.

Fixes: 6cfa9460 ("net/mlx5e: Ethtool driver callback for query/set FEC policy")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Removed unnecessary warnings in FEC caps query · 64e28334

Committed by Shay Agroskin on Oct 28, 2018

Querying interface FEC caps with 'ethtool [int]' after link reset
throws a warning regarding link speed.
This warning is not needed as there is already an indication in
user space that the link is not up.

Fixes: 0696d608 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Fix wrong field name in FEC related functions · febd72f2

Committed by Shay Agroskin on Oct 28, 2018

This bug would result in reading wrong FEC capabilities for 10G/40G.

Fixes: 2095b264 ("net/mlx5e: Add port FEC get/set functions")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Fix a bug in turning off FEC policy in unsupported speeds · 9cdeaab3

Committed by Shay Agroskin on Oct 28, 2018

Some speeds don't support turning FEC policy off. In case a requested
FEC policy is not supported for a speed (including current speed), its new
FEC policy would be:
	no FEC - if disabling FEC is supported for that speed
	unchanged - else
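The per-speed rule above reads naturally as a three-way choice. A minimal Python sketch, assuming policies are represented as strings and each speed has a set of supported policies; the names `NO_FEC` and `new_policy` are hypothetical.

```python
NO_FEC = "off"  # assumption: label used here for "FEC disabled"


def new_policy(current: str, requested: str, supported: set) -> str:
    """Pick the FEC policy for one speed, following the rule in the text."""
    if requested in supported:
        return requested       # requested policy is valid for this speed
    if NO_FEC in supported:
        return NO_FEC          # otherwise disable FEC if that is allowed
    return current             # otherwise leave the speed unchanged
```

For a speed that supports only `"rs"`, requesting `"baser"` leaves the current policy untouched rather than attempting an unsupported "off".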

Fixes: 2095b264 ("net/mlx5e: Add port FEC get/set functions")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

Merge branch 'ena-hibernation-and-rmmod-bug-fixes' · d7c60210

Committed by David S. Miller on Nov 19, 2018

Arthur Kiyanovski says:

====================
net: ena: hibernation and rmmod bug fixes

This patchset includes 2 bug fixes:
1. A fix to a crash during resume from hibernation.
2. A fix to an illegal memory access during driver removal (e.g. during rmmod)
   which might cause a crash in certain systems.

The subminor number in the driver version is also promoted to indicate driver
was changed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: update driver version from 2.0.1 to 2.0.2 · 4c23738a

Committed by Arthur Kiyanovski on Nov 19, 2018

Update driver version due to critical bug fixes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: fix crash during ena_remove() · 58a54b9c

Committed by Arthur Kiyanovski on Nov 19, 2018

In ena_remove() we have the following call stack:
ena_remove()
  unregister_netdev()
  ena_destroy_device()
    netif_carrier_off()

Calling netif_carrier_off() causes linkwatch to try to handle the
link change event on the already unregistered netdev, which leads
to a read from an unreadable memory address.

This patch switches the order of the two functions, so that
netif_carrier_off() is called on a registered netdev.

To accomplish this fix we also had to:
1. Remove the set bit ENA_FLAG_TRIGGER_RESET
2. Add a sanity check in ena_close()
both to prevent double device reset (when calling unregister_netdev()
ena_close is called, but the device was already deleted in
ena_destroy_device()).
3. Set the admin_queue running state to false to avoid using it after
device was reset (for example when calling ena_destroy_all_io_queues()
right after ena_com_dev_reset() in ena_down)

Fixes: 944b28aa ("net: ena: fix missing lock during device destruction")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: fix crash during failed resume from hibernation · e76ad21d

Committed by Arthur Kiyanovski on Nov 19, 2018

During resume from hibernation if ena_restore_device fails,
ena_com_dev_reset() is called, and uses the readless read mechanism,
which was already destroyed by the call to
ena_com_mmio_reg_read_request_destroy(). This causes a NULL pointer
dereference.

In this commit we switch the call order of the above two functions
to avoid this crash.

Fixes: d7703ddb ("net: ena: fix rare bug when failed restart/resume is followed by driver removal")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: not increase stream's incnt before sending addstrm_in request · e1e46479

Committed by Xin Long on Nov 18, 2018

Unlike the processing of an addstrm_out request, the receiver handles
an addstrm_in request by sending back an addstrm_out request to the
sender, who will increase its stream's in and incnt later.

Currently stream->incnt is already increased when the addstrm_in
request is sent out in sctp_send_add_streams(). The wrong stream->incnt
can even cause a crash when copying stream info from the old stream's in
to the new one's in sctp_process_strreset_addstrm_out().

This patch is to fix it by simply removing the stream->incnt change
from sctp_send_add_streams().

Fixes: 242bd2d5 ("sctp: implement sender-side procedures for Add Incoming/Outgoing Streams Request Parameter")
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Fix selftest for small MTUs · 228c4cd0

Committed by Valentine Fatiev on Oct 17, 2018

Loopback test had fixed packet size, which can be bigger than configured
MTU. Shorten the loopback packet size to be bigger than minimal MTU
allowed by the device. The text field is removed from struct 'mlx5ehdr'
as redundant, so that packets as small as the minimal allowed MTU can be sent.

Fixes: d605d668 ("net/mlx5e: Add support for ethtool self diagnostics test")
Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: RX, verify received packet size in Linear Striding RQ · 0073c8f7

Committed by Moshe Shemesh on Oct 11, 2018

In case of striding RQ, we use  MPWRQ (Multi Packet WQE RQ), which means
that WQE (RX descriptor) can be used for many packets and so the WQE is
much bigger than MTU.  In virtualization setups where the port mtu can
be larger than the vf mtu, if received packet is bigger than MTU, it
won't be dropped by HW on too small receive WQE. If we use linear SKB in
striding RQ, since each stride has room for mtu size payload and skb
info, an oversized packet can lead to crash for crossing allocated page
boundary upon the call to build_skb. So driver needs to check packet
size and drop it.

Introduce new SW rx counter, rx_oversize_pkts_sw_drop, which counts the
number of packets dropped by the driver for being too large.
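The software check and its counter can be modelled in a few lines of Python. This is a simplified sketch, assuming a single per-queue length limit; the `RxQueue` class is hypothetical, and only the counter name is taken from the commit message.

```python
class RxQueue:
    def __init__(self, max_len: int):
        self.max_len = max_len                 # room available per stride
        self.rx_oversize_pkts_sw_drop = 0      # counter named in the commit

    def receive(self, pkt_len: int) -> bool:
        """Return True if the packet is accepted, False if dropped."""
        if pkt_len > self.max_len:
            # Hardware did not drop it, so the driver must.
            self.rx_oversize_pkts_sw_drop += 1
            return False
        return True
```

A packet of exactly `max_len` bytes is accepted; anything larger is dropped before any buffer is built from it, and the drop is counted.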

As a new field is added to the RQ struct, re-open the channels whenever
this field is being used in datapath (i.e., in the case of linear
Striding RQ).

Fixes: 619a8f2a ("net/mlx5e: Use linear SKB in Striding RQ")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Apply the correct check for supporting TC esw rules split · 1392f44b

Committed by Roi Dayan on Oct 23, 2018

The mirror and not the output count is the one denoting a split.
Fix to condition the offload attempt on the mirror count being > 0
along with the firmware having the related capability.

Fixes: 592d3651 ("net/mlx5e: Parse mirroring action for offloaded TC eswitch flows")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Yossi Kuperman <yossiku@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Adjust to max number of channles when re-attaching · a1f240f1

Committed by Yuval Avnery on Oct 16, 2018

When the core driver enters the detach/attach flow after a PCI reset,
the number of logical CPUs may have changed.
As a result we need to update the CPU-affiliated resource tables:
	1. indirect rqt list
	2. eq table

Reproduction (PowerPC):
	echo 1000 > /sys/kernel/debug/powerpc/eeh_max_freezes
	ppc64_cpu --smt=on
	# Restart driver
	modprobe -r ... ; modprobe ...
	# Link up
	ifconfig ...
	# Only physical CPUs
	ppc64_cpu --smt=off
	# Inject PCI errors so PCI will reset - calling the pci error handler
	echo 0x8000000000000000 > /sys/kernel/debug/powerpc/<PCI BUS>/err_injct_inboundA

Call trace when trying to add non-existing rqs to an indirect rqt:
	mlx5e_redirect_rqt+0x84/0x260 [mlx5_core] (unreliable)
	mlx5e_redirect_rqts+0x188/0x190 [mlx5_core]
	mlx5e_activate_priv_channels+0x488/0x570 [mlx5_core]
	mlx5e_open_locked+0xbc/0x140 [mlx5_core]
	mlx5e_open+0x50/0x130 [mlx5_core]
	mlx5e_nic_enable+0x174/0x1b0 [mlx5_core]
	mlx5e_attach_netdev+0x154/0x290 [mlx5_core]
	mlx5e_attach+0x88/0xd0 [mlx5_core]
	mlx5_attach_device+0x168/0x1e0 [mlx5_core]
	mlx5_load_one+0x1140/0x1210 [mlx5_core]
	mlx5_pci_resume+0x6c/0xf0 [mlx5_core]

Create cq will fail when trying to use non-existing EQ.

Fixes: 89d44f0a ("net/mlx5_core: Add pci error handlers to mlx5_core driver")
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Always use the match level enum when parsing TC rule match · 83621b7d

Committed by Or Gerlitz on Oct 28, 2018

We get the match level (none, l2, l3, l4) while going over the match
dissectors of an offloaded tc rule. When doing this, the match level
enum and not the min inline enum values should be used; fix that.

This worked accidentally b/c both enums have the same numerical values.

Fixes: d708f902 ('net/mlx5e: Get the required HW match level while parsing TC flow matches')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Claim TC hw offloads support only under a proper build config · 077ecd78

Committed by Or Gerlitz on Oct 18, 2018

Currently, we only support tc hw offloads when the eswitch
support is compiled in, but we are not gating the advertisement
of the NETIF_F_HW_TC feature on this config being set.

Fix it, and while doing that, also avoid dealing with the feature
on ethtool when the config is not set.

Fixes: e8f887ac ('net/mlx5e: Introduce tc offload support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

openeuler / Kernel · synced successfully 10 months ago