Commits · c3e53b9a3efe300a7864ab1ccfbae239d50d0002 · openanolis / cloud-kernel

16 Jun, 2017 3 commits

ibmvnic: Activate disabled RX buffer pools on reset · c3e53b9a

Committed by Thomas Falcon on Jun 14, 2017

RX buffer pools are disabled while awaiting a device
reset if firmware indicates that the resource is closed.

This patch fixes a bug where pools were not being
subsequently enabled after the device reset, causing
the device to become inoperable.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sunvnet: restrict advertized checksum offloads to just IP · 7e9191c5

Committed by Shannon Nelson on Jun 14, 2017

As much as we'd like to play well with others, we really aren't
handling the checksums on non-IP protocol packets very well.  This
is easily seen when trying to do TCP over ipv6 - the checksums are
garbage.

Here we restrict the checksum feature flag to just IP traffic so
that we aren't given work we can't yet do.

Orabug: 26175391, 26259755
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: s2io: remove useless variable in fill_rx_buffers · 9d7cdedd

Committed by Gustavo A. R. Silva on Jun 14, 2017

Remove the useless variable rxd_index and its related code.

Addresses-Coverity-ID: 1397691
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Jun, 2017 6 commits

i40e: Fix a sleep-in-atomic bug · 640f93cc

Committed by Jia-Ju Bai on Jun 14, 2017

The driver may sleep under a spin lock, and the function call path is:
i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh)
  i40e_vsi_remove_pvid
    i40e_vlan_stripping_disable
      i40e_aq_update_vsi_params
        i40e_asq_send_command
          mutex_lock --> may sleep

To fix it, the spin lock is released before "i40e_vsi_remove_pvid", and
the lock is acquired again after this function.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
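The locking change described in the i40e commit above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name below (`model_spin_lock`, `model_may_sleep`, `set_vlan_buggy`, `set_vlan_fixed`) is hypothetical, and a boolean stands in for "a spin lock is held" rather than using real kernel locking primitives.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are i40e or kernel APIs. */
static bool spin_held;       /* models "we are in atomic context" */
static int sleep_in_atomic;  /* counts sleeping calls made under the lock */

static void model_spin_lock(void)   { spin_held = true; }
static void model_spin_unlock(void) { spin_held = false; }

/* Models a callee whose call chain ends in mutex_lock(): it may sleep. */
static void model_may_sleep(void)
{
    if (spin_held)
        sleep_in_atomic++;
}

/* Old shape: the sleeping call is made while the spin lock is held. */
static void set_vlan_buggy(void)
{
    model_spin_lock();
    model_may_sleep();
    model_spin_unlock();
}

/* New shape: drop the lock around the sleeping call, then re-acquire it. */
static void set_vlan_fixed(void)
{
    model_spin_lock();
    /* ... work that needs the lock ... */
    model_spin_unlock();
    model_may_sleep();
    model_spin_lock();
    /* ... remaining work that needs the lock ... */
    model_spin_unlock();
}
```

Calling `set_vlan_buggy()` increments `sleep_in_atomic`, while `set_vlan_fixed()` leaves it unchanged. Releasing and re-acquiring a lock is only safe if the state it protects may change in between, which the sketch does not model.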

liquidio: fix VF driver off-by-one bug when setting ethtool -C ethX rx-frames · 0430a260

Committed by Weilin Chang on Jun 14, 2017

Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlxfw: fix a NULL dereference · f5165a54

Committed by Dan Carpenter on Jun 14, 2017

If we hit this error path we end up returning ERR_PTR(0) which is NULL.
The caller is not expecting that so it results in a NULL dereference.

Fixes: 410ed13c ("Add the mlxfw module for Mellanox firmware flash process")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
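Why ERR_PTR(0) slips past an error check can be shown with a userspace re-implementation of the error-pointer idiom. This is a sketch: the three helpers imitate the kernel's `ERR_PTR`, `PTR_ERR` and `IS_ERR` but are written here for illustration, and `lookup_buggy` and `lookup_fixed` are hypothetical functions, not the mlxfw code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Only the top MAX_ERRNO addresses count as encoded errors. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int component; /* dummy object to return on success */

/* Hypothetical lookup whose error path forgets to set a negative errno. */
static void *lookup_buggy(int fail)
{
    int err = 0;

    if (fail)
        return ERR_PTR(err); /* ERR_PTR(0) is NULL, not an error pointer */
    return &component;
}

/* Same lookup with the error path returning a real error code. */
static void *lookup_fixed(int fail)
{
    if (fail)
        return ERR_PTR(-EINVAL);
    return &component;
}
```

A caller that only tests `IS_ERR()` treats the NULL from `lookup_buggy(1)` as a valid pointer and dereferences it, whereas `lookup_fixed(1)` is caught by the same check.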

qed: Fix an off by one bug · 0331402a

Committed by Dan Carpenter on Jun 14, 2017

The p_l2_info->pp_qid_usage[] array has "p_l2_info->queues" elements so
the > here should be a >= or we write beyond the end of the array.

Fixes: bbe3f233 ("qed: Assign a unique per-queue index to queue-cid")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
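The difference between the two comparisons can be shown on a plain index check. This is a minimal sketch with hypothetical helper names (`qid_ok_buggy`, `qid_ok_fixed`); it is not the qed code, only the same bounds rule applied to an array of `queues` elements.

```c
#include <stdbool.h>
#include <stddef.h>

/* Rejects with '>': index == queues is accepted, one past the last slot. */
static bool qid_ok_buggy(size_t qid, size_t queues)
{
    if (qid > queues)
        return false;
    return true;
}

/* Rejects with '>=': only indices 0 .. queues - 1 are accepted. */
static bool qid_ok_fixed(size_t qid, size_t queues)
{
    if (qid >= queues)
        return false;
    return true;
}
```

With four queues, `qid_ok_buggy(4, 4)` accepts an index that lies outside the array, while `qid_ok_fixed(4, 4)` rejects it and still accepts index 3.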

mlxsw: spectrum: Add support for access cable info via ethtool · 2ea10903

Committed by Arkadi Sharshevsky on Jun 14, 2017

Add support for access cable info via ethtool.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: reg: Add MCIA register for cable info access · 7ca36994

Committed by Arkadi Sharshevsky on Jun 14, 2017

The MCIA register is used to access the SFP+ and QSFP connector's
EPROM. It will be used to query the cable info.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Jun, 2017 16 commits

ixgbe: pci_set_drvdata must be called before register_netdev · a09c0fc3

Committed by Jeff Mahoney on Jun 03, 2017

We call pci_set_drvdata immediately after calling register_netdev,
which leaves a window where tasks writing to the sriov_numvfs sysfs
attribute can sneak in and crash the kernel.  register_netdev cleans
up after itself so placing pci_set_drvdata immediately before it
should preserve the intent of commit 0fb6a55c ("ixgbe: fix crash
on rmmod after probe fail").

Fixes: 0fb6a55c ("ixgbe: fix crash on rmmod after probe fail")
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: Resolve cppcheck format string warning · 4ebdf8af

Committed by Tony Nguyen on Jun 01, 2017

cppcheck warns that the format string is incorrect in the function
ixgbe_get_strings().  Since the value cannot be negative, change the
variable to unsigned which matches the format specifier.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: fix writes to PFQDE · d28b1949

Committed by Emil Tantilov on May 23, 2017

ixgbe_write_qde() was ignoring the qde parameter which resulted
in PFQDE.HIDE_VLAN not being set for X550.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: Bump version number · adc2c83e

Committed by Tony Nguyen on May 18, 2017

Update ixgbevf version number.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: Bump version number · 01ec5525

Committed by Tony Nguyen on May 18, 2017

Update ixgbe version number.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: check for Tx timestamp timeouts during watchdog · 622a2ef5

Committed by Jacob Keller on May 03, 2017

The ixgbe driver has logic to handle only one Tx timestamp at a time,
using a state bit lock to avoid multiple requests at once.

It may be possible, if incredibly unlikely, that a Tx timestamp event is
requested but never completes. Since we use an interrupt scheme to
determine when the Tx timestamp occurred we would never clear the state
bit in this case.

Add an ixgbe_ptp_tx_hang() function similar to the already existing
ixgbe_ptp_rx_hang() function. This function runs in the watchdog routine
and makes sure we eventually recover from this case instead of
permanently disabling Tx timestamps.

Note: there is no currently known way to cause this without hacking the
driver code to force it.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
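The watchdog-side recovery described above can be sketched as a periodic check that releases a stale "in progress" flag. This is only a model: the struct, the `TX_TIMEOUT_TICKS` value and the function name `ptp_tx_hang_check` are assumptions made for illustration, and time is passed in as a plain tick counter instead of using jiffies.

```c
#include <stdbool.h>
#include <stdint.h>

#define TX_TIMEOUT_TICKS 15 /* hypothetical timeout, in arbitrary ticks */

struct ptp_state {
    bool tx_in_progress;   /* models the state bit lock */
    uint64_t tx_start;     /* tick at which the timestamp was requested */
    uint64_t tx_timeouts;  /* how many stale requests were cleaned up */
};

/* Meant to be called periodically, e.g. from a watchdog routine. */
static void ptp_tx_hang_check(struct ptp_state *s, uint64_t now)
{
    if (!s->tx_in_progress)
        return;

    /* The completion never arrived: give up and release the lock. */
    if (now - s->tx_start > TX_TIMEOUT_TICKS) {
        s->tx_in_progress = false;
        s->tx_timeouts++;
    }
}
```

A request that started at tick 100 is left alone at tick 110 and released at tick 116, after which new timestamp requests can be accepted again.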

ixgbe: add statistic indicating number of skipped Tx timestamps · 4cc74c01

Committed by Jacob Keller on May 03, 2017

The ixgbe driver can only handle one Tx timestamp request at a time.
This means it is possible for an application timestamp request to be
ignored.

There is no easy way for an administrator to determine if this occurred.
Add a new statistic which tracks this, tx_hwtstamp_skipped.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: avoid permanent lock of *_PTP_TX_IN_PROGRESS · 5fef124d

Committed by Jacob Keller on May 03, 2017

The ixgbe driver uses a state bit lock to avoid handling more than one Tx
timestamp request at once. This is required because hardware is limited
to a single set of registers for Tx timestamps.

The state bit lock is not properly cleaned up during
ixgbe_xmit_frame_ring() if the transmit fails such as due to DMA or TSO
failure. In some hardware this results in blocking timestamps until the
service task times out. In other hardware this results in a permanent
lock of the timestamp bit because we never receive an interrupt
indicating the timestamp occurred, since indeed the packet was never
transmitted.

Fix this by checking for DMA and TSO errors in ixgbe_xmit_frame_ring() and
properly cleaning up after ourselves when these occur.
Reported-by: David Mirabito <davidm@metamako.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: fix race condition with PTP_TX_IN_PROGRESS bits · aaebaf50

Committed by Jacob Keller on May 03, 2017

Hardware related to the ixgbe driver is limited to handling a single Tx
timestamp request at a time. Thus, the driver ignores requests for Tx
timestamp while waiting for the current request to finish. It uses
a state bit lock which enforces that only one timestamp request is
honored at a time.

Unfortunately this suffers from a simple race condition. The bit lock is
not cleared until after skb_tstamp_tx() is called notifying applications
of a new Tx timestamp. Even a well behaved application sending only one
packet at a time and waiting for a response can wake up and send a new
packet before the bit lock is cleared. This results in needlessly
dropping some Tx timestamp requests.

We can fix this by unlocking the state bit as soon as we read the
Timestamp register, as this is the first point at which it is safe to
unlock.

To avoid issues with the skb pointer, we'll use a copy of the pointer
and set the global variable in the driver structure to NULL first. This
ensures that the next timestamp request does not modify our local copy
of the skb pointer.

This ensures that well behaved applications do not accidentally race
with the unlock bit. Obviously an application which sends multiple Tx
timestamp requests at once will still only timestamp one packet at
a time. Unfortunately there is nothing we can do about this.
Reported-by: David Mirabito <davidm@metamako.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
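The ordering described above (take a local copy of the pointer, clear the shared slot, release the bit lock, and only then notify) can be sketched with C11 atomics. This is a simplified model, not the ixgbe code: `struct pkt`, `struct adapter`, `request_timestamp` and `complete_timestamp` are hypothetical names, and the notification step is represented by returning the packet to the caller.

```c
#include <stdatomic.h>
#include <stddef.h>

struct pkt { int id; };

struct adapter {
    atomic_flag tx_in_progress; /* models the state bit lock */
    struct pkt *tx_skb;         /* the one packet awaiting a timestamp */
};

/* Returns 1 if the request was accepted, 0 if it was skipped. */
static int request_timestamp(struct adapter *a, struct pkt *p)
{
    if (atomic_flag_test_and_set(&a->tx_in_progress))
        return 0; /* another request is still outstanding */
    a->tx_skb = p;
    return 1;
}

/* Completion path: release the lock before the application is notified. */
static struct pkt *complete_timestamp(struct adapter *a)
{
    struct pkt *p = a->tx_skb; /* local copy survives the next request */

    a->tx_skb = NULL;
    atomic_flag_clear(&a->tx_in_progress);

    /* Notification would happen here, using only the local copy. */
    return p;
}
```

Because the flag is already clear by the time the caller sees the completed packet, a follow-up request issued right after the notification is accepted instead of being dropped.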

cxgb4: handle serial flash interrupt · 38b6ec50

Committed by Ganesh Goudar on Jun 14, 2017

If SF bit is not cleared in PL_INT_CAUSE, subsequent non-data
interrupts are not raised.  Enable SF bit in Global Interrupt
Mask and handle it as non-fatal and hence eventually clear it.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

networking: use skb_put_zero() · aa9f979c

Committed by Johannes Berg on Jun 13, 2017

Use the recently introduced helper to replace the pattern of
skb_put() && memset(), this transformation was done with the
following spatch:

@@
identifier p;
expression len;
expression skb;
@@
-p = skb_put(skb, len);
-memset(p, 0, len);
+p = skb_put_zero(skb, len);
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
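The pattern that the spatch above collapses can be modelled on a plain byte buffer. This is a userspace sketch only: `struct buf`, `buf_put` and `buf_put_zero` are hypothetical stand-ins for an skb and its helpers, and returning NULL on overflow is a choice made for this example.

```c
#include <stddef.h>
#include <string.h>

struct buf {
    unsigned char data[64];
    size_t len; /* bytes currently in use */
};

/* Reserve len bytes at the tail; the returned area is uninitialized. */
static void *buf_put(struct buf *b, size_t len)
{
    void *p;

    if (len > sizeof(b->data) - b->len)
        return NULL; /* not enough room left */
    p = b->data + b->len;
    b->len += len;
    return p;
}

/* Reserve len bytes at the tail and zero them in one call. */
static void *buf_put_zero(struct buf *b, size_t len)
{
    void *p = buf_put(b, len);

    if (p)
        memset(p, 0, len);
    return p;
}
```

Folding the `memset()` into the helper removes the chance of the two length arguments drifting apart at a call site.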

qed: fix dump of context data · ace17c36

Committed by Tayar, Tomer on Jun 13, 2017

Currently, when dumping context data, only word number '1' is read for the
entire context.

Fixes: c965db44 ("qed: Add support for debug data collection")
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: Make phy_ethtool_ksettings_get return void · 5514174f

Committed by yuval.shaia@oracle.com on Jun 13, 2017

Make the return value void since the function never returns a meaningful value.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Remove netdev notify for failover resets · 61d3e1d9

Committed by Nathan Fontenot on Jun 12, 2017

When handling a driver reset due to a failover of the backing
server on the vios, doing the netdev_notify_peers() can cause
network traffic to stall or halt. Remove the netdev notify call
for failover resets.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Client-initiated failover · 40c9db8a

Committed by Thomas Falcon on Jun 12, 2017

The IBM vNIC protocol provides support for the user to initiate
a failover from the client LPAR in case the current backing infrastructure
is deemed inadequate or in an error state.

Support for two H_VIOCTL sub-commands for vNIC devices is required
to implement this function. These commands are H_GET_SESSION_TOKEN
and H_SESSION_ERR_DETECTED.

"[H_GET_SESSION_TOKEN] is used to obtain a session token from a VNIC client
adapter.  This token is opaque to the caller and is intended to be used in
tandem with the SESSION_ERROR_DETECTED vioctl subfunction."

"[H_SESSION_ERR_DETECTED] is used to report that the currently active
backing device for a VNIC client adapter is behaving poorly, and that
the hypervisor should attempt to fail over to a different backing device,
if one is available."

To provide tools access to this functionality the vNIC driver creates a
sysfs file that, when written to, will send a request to pHyp to failover
to a different backing device.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mvpp2: enable basic 10G support · 725757ae

Committed by Antoine Ténart on Jun 12, 2017

On GOP port 0 two MAC modes are available: GMAC and XLG. The XLG MAC is
used for 10G connectivity. This patch adds basic 10G support by
allowing the use of the XLG MAC on port 0 and by reworking the
port_enable/disable functions so that the XLG MAC is configured when
using 10G.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Jun, 2017 1 commit

i40e: fix handling of HW ATR eviction · 6964e53f

Committed by Jacob Keller on Jun 12, 2017

A recent commit to refactor the driver and remove the hw_disabled_flags
field accidentally introduced two regressions. First, we overwrote
pf->flags which removed various key flags including the MSI-X settings.

Additionally, it was intended that we now have two flags,
HW_ATR_EVICT_CAPABLE and HW_ATR_EVICT_ENABLED, but this was not done,
and we were accidentally misusing HW_ATR_EVICT_CAPABLE everywhere.

This patch adds the missing piece, HW_ATR_EVICT_ENABLED, and safely
updates pf->flags instead of overwriting it.

Without this patch we will have many problems including disabling MSI-X
support, and we'll attempt to use HW ATR eviction on devices which do
not support it.

Fixes: 47994c11 ("i40e: remove hw_disabled_flags in favor of using separate flag bits", 2017-04-19)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
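Both regressions named above, overwriting a flags word and conflating "capable" with "enabled", can be shown on a plain bitmask. This is an illustrative sketch: the `FLAG_*` names and bit positions are made up for the example and are not the i40e definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits; names and values are assumptions. */
#define FLAG_MSIX_ENABLED         (1u << 0)
#define FLAG_HW_ATR_EVICT_CAPABLE (1u << 1)
#define FLAG_HW_ATR_EVICT_ENABLED (1u << 2)

/* Assignment replaces the whole word, silently dropping other flags. */
static uint32_t enable_evict_buggy(uint32_t flags)
{
    flags = FLAG_HW_ATR_EVICT_CAPABLE;
    return flags;
}

/* OR-in a separate "enabled" bit, and only when the device is capable. */
static uint32_t enable_evict_fixed(uint32_t flags)
{
    if (flags & FLAG_HW_ATR_EVICT_CAPABLE)
        flags |= FLAG_HW_ATR_EVICT_ENABLED;
    return flags;
}
```

Starting from a word with the MSI-X bit set, the buggy variant returns a word without it, while the fixed variant preserves it and sets the enabled bit only for capable devices.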

12 Jun, 2017 10 commits

sh_eth: add support for changing MTU · 78d61022

Committed by Niklas Söderlund on Jun 12, 2017

The hardware supports changing the MTU and the driver itself is
somewhat prepared to support this. This patch hooks up the callbacks to
be able to change the MTU from user-space.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: update ena driver to version 1.1.7 · e7ff7efa

Committed by Netanel Belgazal on Jun 11, 2017

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: bug fix in lost tx packets detection mechanism · 800c55cb

Committed by Netanel Belgazal on Jun 11, 2017

check_for_missing_tx_completions() is called from a timer
task and looks for lost tx packets.
The old implementation accumulated all the lost tx packets
and did not check whether those packets were retrieved at a later stage.
This led to a situation where the driver reset
the device for no reason.

Fixes: 1738cd3e ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: disable admin msix while working in polling mode · a2cc5198

Committed by Netanel Belgazal on Jun 11, 2017

Fixes: 1738cd3e ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: fix theoretical Rx hang on low memory systems · a3af7c18

Committed by Netanel Belgazal on Jun 11, 2017

In the rare case where the device runs out of free rx buffer
descriptors (under kernel memory pressure) and the napi handler
continuously fails to refill new Rx descriptors until the device rx
queue runs out of all free rx buffers to post incoming packets,
a deadlock occurs:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI, which is no longer being invoked.

The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) from the keepalive/watchdog task.

Fixes: 1738cd3e ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: add missing unmap bars on device removal · 0857d92f

Committed by Netanel Belgazal on Jun 11, 2017

This patch also changes the mapping functions to devm_ functions.

Fixes: 1738cd3e ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: fix race condition between submit and completion admin command · 661d2b0c

Committed by Netanel Belgazal on Jun 11, 2017

Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).

Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function has a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context is
allocated before the call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.

This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.

Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()

Fixes: 1738cd3e ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
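The idea of gating submissions on a count of in-use completion contexts, rather than on head and tail indexes, can be sketched with a C11 atomic counter. This is a simplified model and not the ENA code: `struct admin_queue`, `admin_submit` and `comp_ctx_release` are hypothetical names, and the sketch assumes a single submitter at a time (the commit message says submission runs under a lock).

```c
#include <stdatomic.h>
#include <stdbool.h>

struct admin_queue {
    atomic_int outstanding_cmds; /* completion contexts currently in use */
    int depth;                   /* total number of completion contexts */
};

/* Submit path: refuse when every completion context is still in use. */
static bool admin_submit(struct admin_queue *q)
{
    if (atomic_load(&q->outstanding_cmds) >= q->depth)
        return false;
    atomic_fetch_add(&q->outstanding_cmds, 1);
    return true;
}

/* Called when the waiter has finished with its completion context. */
static void comp_ctx_release(struct admin_queue *q)
{
    atomic_fetch_sub(&q->outstanding_cmds, 1);
}
```

A slot only becomes available again once its context has actually been released, so a completion that has advanced the head but whose context is still held no longer lets a new submission through.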

net: ena: add missing return when ena_com_get_io_handlers() fails · 2d2c600a

Committed by Netanel Belgazal on Jun 11, 2017

Fixes: 1738cd3e ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: fix bug that might cause hang after consecutive open/close interface. · 418df30f

Committed by Netanel Belgazal on Jun 11, 2017

Fix a bug where the driver does not unmask the IO interrupts
in ndo_open():
occasionally, the MSI-X interrupt (for one or more IO queues)
can be masked when ndo_close() is called.
If that is followed by ndo_open(),
then the MSI-X interrupt will still be masked, so no interrupt
will be received by the driver.

Fixes: 1738cd3e ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: fix rare uncompleted admin command false alarm · a77c1aaf

Committed by Netanel Belgazal on Jun 11, 2017

The current flow to detect admin completion is:
while (command_not_completed) {
	if (timeout)
		error

	check_for_completion()
	sleep()
}
So if the sleep takes longer than the timeout
(because the thread/workqueue was not scheduled due to a higher priority
task or a prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Fixes: 1738cd3e ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
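The effect of reordering the two checks can be reproduced with a small simulation in which time only advances during the sleep. This is a sketch under stated assumptions: `struct sim`, `wait_old` and `wait_new` are hypothetical, and an oversleep is modelled simply by a large `sleep_len`.

```c
#include <stdbool.h>

struct sim {
    int now;          /* current time in ticks */
    int hw_done_at;   /* tick at which the device posts the completion */
    int sleep_len;    /* how long each sleep really takes */
    bool status_done; /* driver's view, updated only by polling */
};

static void check_for_completion(struct sim *s)
{
    if (s->now >= s->hw_done_at)
        s->status_done = true;
}

/* Old order: the timeout is tested before polling. Returns -1 on timeout. */
static int wait_old(struct sim *s, int deadline)
{
    while (!s->status_done) {
        if (s->now > deadline)
            return -1;
        check_for_completion(s);
        s->now += s->sleep_len;
    }
    return 0;
}

/* New order: poll first, and only then decide whether time has run out. */
static int wait_new(struct sim *s, int deadline)
{
    for (;;) {
        check_for_completion(s);
        if (s->status_done)
            return 0;
        if (s->now > deadline)
            return -1;
        s->now += s->sleep_len;
    }
}
```

With a completion posted at tick 5, a deadline of 10 and a sleep that lasts 20 ticks, `wait_old` reports a timeout although the completion is already there, while `wait_new` picks it up. A command that never completes still times out in the new order.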

11 Jun, 2017 4 commits

net/mlx5: Enable 4K UAR only when page size is bigger than 4K · 91828bd8

Committed by Majd Dibbiny on May 28, 2017

When the page size isn't bigger than 4K, there is no added value in enabling
the 4K UAR feature in the firmware.

Modified the condition of enabling the 4K UAR accordingly.

Fixes: f502d834 ("net/mlx5: Activate support for 4K UARs")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Fix wrong indications in DIM due to counter wraparound · 53acd76c

Committed by Tal Gilboa on May 29, 2017

DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
In each iteration of the algorithm, DIM calculates the difference in
throughput, packet rate and interrupt rate from the last iteration in order
to make a decision. DIM relies on counters for each metric. When these
counters reach their type's max value they wrap around. In this case
the delta between 'end' and 'start' samples is negative and, when
translated to unsigned integers, very high. This results in a false
indication to the algorithm and might result in a wrong decision.

The fix calculates the 'distance' between 'end' and 'start' samples in a
cyclic way around the relevant type's max value. It can also be viewed as
an absolute value around the type's max value instead of around 0.

Testing shows higher stability in DIM profile selection and no wraparound
issues.

Fixes: cb3c7fd4 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
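A cyclic distance of the kind described above can be computed by subtracting in unsigned arithmetic and masking to the counter's width. This is one possible sketch, not necessarily the way the mlx5e patch does it; `counter_delta` and its `bits` parameter are names chosen for this example.

```c
#include <stdint.h>

/*
 * Distance travelled from 'start' to 'end' by a counter that is 'bits'
 * wide and wraps back to 0 after reaching its maximum value.
 */
static uint32_t counter_delta(uint32_t end, uint32_t start, unsigned bits)
{
    uint32_t mask = bits >= 32 ? UINT32_MAX : (UINT32_C(1) << bits) - 1;

    /* Unsigned subtraction wraps modulo 2^32; the mask narrows it. */
    return (end - start) & mask;
}
```

For a 16-bit counter that moved from 65530 to 5, the function returns 11 rather than a huge value, and for samples that did not wrap it returns the ordinary difference.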

net/mlx5e: Added BW check for DIM decision mechanism · c3164d2f

Committed by Tal Gilboa on May 15, 2017

DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Until now only interrupt and packet rate were sampled.
We found a scenario in which we get a false indication since a change in
DIM caused more aggregation and reduced packet rate while increasing BW.

We now regard a change as successful iff:
current_BW > (prev_BW + threshold) or
current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or
current_BW ~= prev_BW and current_PR ~= prev_PR and
    current_IR < (prev_IR - threshold)
Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate

Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off)
    --------------------------------------------------
    packet size | before[Mb/s] | after[Mb/s] | gain  |
    2B          | 343.4        | 359.4       |  4.5% |
    16B         | 2739.7       | 2814.8      |  2.7% |
    64B         | 9739         | 10185.3     |  4.5% |

Fixes: cb3c7fd4 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
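The three-tier rule listed above (bandwidth first, then packet rate, then interrupt rate) can be written as a small predicate. This is a sketch under assumptions: `struct dim_sample`, the helper names and the single absolute threshold `thr` are made up for illustration, and "~=" is read as "neither higher nor lower by more than the threshold".

```c
#include <stdbool.h>
#include <stdint.h>

struct dim_sample {
    uint64_t bw; /* bandwidth */
    uint64_t pr; /* packet rate */
    uint64_t ir; /* interrupt rate */
};

static bool is_higher(uint64_t cur, uint64_t prev, uint64_t thr)
{
    return cur > prev + thr;
}

static bool is_lower(uint64_t cur, uint64_t prev, uint64_t thr)
{
    return cur + thr < prev; /* written this way to avoid underflow */
}

static bool is_similar(uint64_t cur, uint64_t prev, uint64_t thr)
{
    return !is_higher(cur, prev, thr) && !is_lower(cur, prev, thr);
}

/* True when the last change counts as an improvement. */
static bool dim_change_successful(const struct dim_sample *cur,
                                  const struct dim_sample *prev,
                                  uint64_t thr)
{
    if (is_higher(cur->bw, prev->bw, thr))
        return true;
    if (!is_similar(cur->bw, prev->bw, thr))
        return false;
    if (is_higher(cur->pr, prev->pr, thr))
        return true;
    if (!is_similar(cur->pr, prev->pr, thr))
        return false;
    return is_lower(cur->ir, prev->ir, thr);
}
```

A sample with clearly higher bandwidth counts as a success even if its packet rate dropped, which is the aggregation scenario the commit message describes.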

net/mlx5: Remove several module events out of ethtool stats · f729860a

Committed by Huy Nguyen on May 08, 2017

Remove the following module event counters from ethtool stats. The
reason for removing these event counters is that these events do not
occur without a technician's intervention.
  module_pwr_budget_exd
  module_long_range
  module_no_eeprom
  module_enforce_part
  module_unknown_id
  module_unknown_status
  module_plug

Fixes: bedb7c90 ("net/mlx5e: Add port module event counters to ethtool stats")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

openanolis / cloud-kernel · last synced successfully more than 1 year ago