提交 · ac6d1835ca964b053f051ecedd44ecb3de0f1b33 · openeuler / Kernel

01 8月, 2020 2 次提交

由 Jesse Brandeburg 提交于 7月 29, 2020

Display and count some useful hot-path statistics. The usefulness is as
follows:

- tx_restart: use to determine if the transmit ring size is too small or
  if the transmit interrupt rate is too low.
- rx_gro_dropped: use to count drops from GRO layer, which previously were
  completely uncounted when occurring.
- tx_busy: use to determine when the driver is miscounting number of
  descriptors needed for an skb.
- tx_timeout: as our other drivers, count the number of times we've reset
  due to timeout because the kernel only prints a warning once per netdev.

Several of these were already counted but not displayed.
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

a8fffd7a

ice: remove page_reuse statistic · a4c493fe

由 Jesse Brandeburg 提交于 7月 29, 2020

The page reuse statistic wasn't even being displayed to the user, even
though the driver counted it. Don't waste the struct space and hot-path
cycles since the driver doesn't display it.
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

a4c493fe

28 5月, 2020 1 次提交

ice: fix signed vs unsigned comparisons · 22bef5e7

由 Jesse Brandeburg 提交于 5月 15, 2020

Fix the remaining signed vs unsigned issues, which appear
when compiling with -Werror=sign-compare.

Many of these are because there is an external interface that is passing
an int to us (which we can't change) but that we (rightfully) store
and compare against as an unsigned in our data structures.
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: NBruce Allan <bruce.w.allan@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

22bef5e7

23 5月, 2020 2 次提交

ice: Support IPv4 Flow Director filters · cac2a27c

由 Henry Tieman 提交于 5月 11, 2020

Support the addition and deletion of IPv4 filters.

Supported fields are: src-ip, dst-ip, src-port, and dst-port
Supported flow-types are: tcp4, udp4, sctp4, ip4

Example usage:

ethtool -N eth0 flow-type tcp4 src-ip 192.168.0.55 dst-ip 172.16.0.55 \
src-port 16 dst-port 12 action 32
Signed-off-by: NHenry Tieman <henry.w.tieman@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

cac2a27c

ice: Initialize Flow Director resources · 148beb61

由 Henry Tieman 提交于 5月 11, 2020

Flow Director allows for redirection based on ntuple rules. Rules are
programmed using the ethtool set-ntuple interface. Supported actions are
redirect to queue and drop.

Setup the initial framework to process Flow Director filters. Create and
allocate resources to manage and program filters to the hardware. Filters
are processed via a sideband interface; a control VSI is created to manage
communication and process requests through the sideband. Upon allocation of
resources, update the hardware tables to accept perfect filters.
Signed-off-by: NHenry Tieman <henry.w.tieman@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

148beb61

22 5月, 2020 2 次提交

ice: Add support for tunnel offloads · a4e82a81

由 Tony Nguyen 提交于 5月 06, 2020

Create a boost TCAM entry for each tunnel port in order to get a tunnel
PTYPE. Update netdev feature flags and implement the appropriate logic to
get and set values for hardware offloads.
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: NHenry Tieman <henry.w.tieman@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

a4e82a81

ice, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL · 175fc430

由 Björn Töpel 提交于 5月 20, 2020

Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
APIs.

v4->v5: Fixed "warning: Excess function parameter 'alloc' description
        in 'ice_alloc_rx_bufs_zc'" and "warning: Excess function
        parameter 'xdp' description in
        'ice_construct_skb_zc'". (Jakub)
Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-10-bjorn.topel@gmail.com

175fc430

20 2月, 2020 1 次提交

ice: Don't reject odd values of usecs set by user · 840f8ad0

由 Brett Creeley 提交于 2月 13, 2020

Currently if a user sets an odd [tx|rx]-usecs value through ethtool,
the request is denied because the hardware is set to have an ITR
granularity of 2us. This caused poor customer experience. Fix this by
aligning to a register allowed value, which results in rounding down.
Also, print a once per ring container type message to be clear about
our intentions.

Also, change the ITR_TO_REG define to be the bitwise and of the ITR
setting and the ICE_ITR_MASK. This makes the purpose of ITR_TO_REG more
obvious.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

840f8ad0

13 2月, 2020 1 次提交

ice: Trivial fixes · 4ee656bb

由 Tony Nguyen 提交于 2月 06, 2020

This is a collection of trivial fixes including fixing whitespace, typos,
function headers, reverse Christmas tree, etc.
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

4ee656bb

04 1月, 2020 1 次提交

ice: Restore interrupt throttle settings after VSI rebuild · 61dc79ce

由 Michal Swiatkowski 提交于 12月 12, 2019

After each rebuild driver deallocates q_vectors, so the interrupt
throttle rate (ITR) settings get lost.

Create a function to save and restore ITR for each queue. If a user
increases the number of queues, restore all the previous queue
settings for each existing queue, and the additional queues will
get the default setting.
Signed-off-by: NMichal Swiatkowski <michal.swiatkowski@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

61dc79ce

05 11月, 2019 6 次提交

ice: introduce frame padding computation logic · 59bb0808

由 Maciej Fijalkowski 提交于 10月 24, 2019

Take into account the underlying architecture specific settings and
based on that calculate the possible padding that can be supplied.
Typically, for x86 and standard MTU size we will end up with 192 bytes
of headroom. This is the same behavior as our other drivers have and we
can dedicate it for XDP purposes.

Furthermore, introduce the Rx ring flag for indicating whether build_skb
is used on particular. Based on that invoke the routines for padding
calculation.
Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

59bb0808

ice: introduce legacy Rx flag · 7237f5b0

由 Maciej Fijalkowski 提交于 10月 24, 2019

Add an ethtool "legacy-rx" priv flag for toggling the Rx path. This
control knob will be mainly used for build_skb usage as well as buffer
size/MTU manipulation.

In preparation for adding build_skb support in a way that it takes
care of how we set the values of max_frame and rx_buf_len fields of
struct ice_vsi. Specifically, in this patch mentioned fields are set to
values that will allow us to provide headroom and tailroom in-place.

This can be mostly broken down onto following:
- for legacy-rx "on" ethtool control knob, old behaviour is kept;
- for standard 1500 MTU size configure the buffer of size 1536, as
  network stack is expecting the NET_SKB_PAD to be provided and
  NET_IP_ALIGN can have a non-zero value (these can be typically equal
  to 32 and 2, respectively);
- for larger MTUs go with max_frame set to 9k and configure the 3k
  buffer in case when PAGE_SIZE of underlying arch is less than 8k; 3k
  buffer is implying the need for order 1 page, so that our page
  recycling scheme can still be applied;

With that said, substitute the hardcoded ICE_RXBUF_2048 and PAGE_SIZE
values in DMA API that we're making use of with rx_ring->rx_buf_len and
ice_rx_pg_size(rx_ring). The latter is an introduced helper for
determining the page size based on its order (which was figured out via
ice_rx_pg_order). Last but not least, take care of truesize calculation.

In the followup patch the headroom/tailroom computation logic will be
introduced.

This change aligns the buffer and frame configuration with other Intel
drivers, most importantly with iavf.
Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

7237f5b0

ice: Add support for AF_XDP · 2d4238f5

由 Krzysztof Kazimierczak 提交于 11月 04, 2019

Add zero copy AF_XDP support.  This patch adds zero copy support for
Tx and Rx; code for zero copy is added to ice_xsk.h and ice_xsk.c.

For Tx, implement ndo_xsk_wakeup. As with other drivers, reuse
existing XDP Tx queues for this task, since XDP_REDIRECT guarantees
mutual exclusion between different NAPI contexts based on CPU ID. In
turn, a netdev can XDP_REDIRECT to another netdev with a different
NAPI context, since the operation is bound to a specific core and each
core has its own hardware ring.

For Rx, allocate frames as MEM_TYPE_ZERO_COPY on queues that AF_XDP is
enabled.
Signed-off-by: NKrzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Co-developed-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

2d4238f5

ice: Move common functions to ice_txrx_lib.c · 0891d6d4

由 Krzysztof Kazimierczak 提交于 11月 04, 2019

In preparation of AF XDP, move functions that will be used both by skb and
zero-copy paths to a new file called ice_txrx_lib.c. This allows us to
avoid using ifdefs to control the staticness of said functions.

Move other functions (ice_rx_csum, ice_rx_hash and ice_ptype_to_htype)
called only by the moved ones to the new file as well.
Signed-off-by: NKrzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

0891d6d4

ice: Add support for XDP · efc2214b

由 Maciej Fijalkowski 提交于 11月 04, 2019

Add support for XDP. Implement ndo_bpf and ndo_xdp_xmit.  Upon load of
an XDP program, allocate additional Tx rings for dedicated XDP use.
The following actions are supported: XDP_TX, XDP_DROP, XDP_REDIRECT,
XDP_PASS, and XDP_ABORTED.
Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

efc2214b

ice: Introduce ice_base.c · eff380aa

由 Anirudh Venkataramanan 提交于 10月 24, 2019

Remove a few uses of kernel configuration flags from ice_lib.c by
introducing a new source file ice_base.c. Also move corresponding
function prototypes from ice_lib.h to ice_base.h and include ice_base.h
where required.
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

eff380aa

21 8月, 2019 1 次提交

ice: Set WB_ON_ITR when we don't re-enable interrupts · 2ab28bb0

由 Brett Creeley 提交于 7月 25, 2019

Currently when busy polling is enabled we aren't setting/enabling
WB_ON_ITR in the driver. This doesn't break the driver, but it does
cause issues. If we don't enable WB_ON_ITR mode we will still get
write-backs from hardware during polling when a cache line has been
filled, but if a cache line is not filled we will not get the
write-back because WB_ON_ITR is not set. Fix this by enabling
WB_ON_ITR in the driver when interrupts are disabled.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

2ab28bb0

24 5月, 2019 2 次提交

ice: Use bitfields when possible · 0ab54c5f

由 Jesse Brandeburg 提交于 4月 16, 2019

We can use bit fields to store boolean values and when the
bit fields are next to each other, the compiler will combine them
(as long as the size holds enough).
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

0ab54c5f

ice: Reorganize tx_buf and ring structs · 65124bbf

由 Jesse Brandeburg 提交于 4月 16, 2019

Use more efficient structure ordering by using the pahole tool
and a lot of code inspection to get hot cache lines to have
packed data (no holes if possible) and adjacent warm data.

ice_ring prior to this change:
  /* size: 192, cachelines: 3, members: 23 */
  /* sum members: 158, holes: 4, sum holes: 12 */
  /* padding: 22 */

ice_ring after this change:
  /* size: 192, cachelines: 3, members: 25 */
  /* sum members: 162, holes: 1, sum holes: 1 */
  /* padding: 29 */

ice_tx_buf prior to this change:
  /* size: 48, cachelines: 1, members: 7 */
  /* sum members: 38, holes: 2, sum holes: 6 */
  /* padding: 4 */
  /* last cacheline: 48 bytes */

ice_tx_buf after this change:
  /* size: 40, cachelines: 1, members: 7 */
  /* sum members: 38, holes: 1, sum holes: 2 */
  /* last cacheline: 40 bytes */
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

65124bbf

02 5月, 2019 1 次提交

ice: Add ability to update rx-usecs-high · b9c8bb06

由 Brett Creeley 提交于 2月 28, 2019

Currently the driver allows rx-usecs-high values to be set,
but when querying the device for rx-usecs-high the value
does not stick. This is because it was not yet implemented.
Add code to allow the user to change rx-usecs-high and
use this to set the q_vector's intrl value.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

b9c8bb06

18 4月, 2019 2 次提交

ice: Add priority information into VLAN header · 5f6aa50e

由 Anirudh Venkataramanan 提交于 2月 28, 2019

This patch introduces a new function ice_tx_prepare_vlan_flags_dcb to
insert 802.1p priority information into the VLAN header
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

5f6aa50e

ice: Update rings based on TC information · a629cf0a

由 Anirudh Venkataramanan 提交于 2月 28, 2019

This patch adds a new function ice_vsi_cfg_dcb_rings which updates a
VSI's rings based on DCB traffic class information.
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

a629cf0a

27 3月, 2019 3 次提交

ice: Update comment regarding the ITR_GRAN_S · 92414f32

由 Brett Creeley 提交于 2月 19, 2019

Since the driver now hard codes the ITR granularity to 2 us in the
GLINT_CTL register the comment next to ITR_GRAN_S needs to be updated.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

92414f32

ice: Audit hotpath structures with pahole · 8244dd2d

由 Brett Creeley 提交于 2月 19, 2019

Currently the ice_q_vector structure and ice_ring_container structure
are taking up more space than necessary due to cache alignment holes
and unnecessary variables respectively. This is not helping the
driver's performance. The following fixes were done to improve cache
alignment, reduce wasted space, and increase performance.

1. Remove the ice_latency_range enum as it is unused.
2. Remove the latency_range variable in the ice_ring_container structure.
3. Change the size of the itr_idx in the ice_ring_container structure
from an int to an u16. This reduced the size of ice_ring_container
structure to 32 Bytes so it has no holes or padding.
4. Re-arrange the ice_q_vector structure using pahole to align
members as best as possible in regards to 64 Byte cache line size.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

8244dd2d

ice: Fix for adaptive interrupt moderation · 64a59d05

由 Anirudh Venkataramanan 提交于 2月 19, 2019

commit 63f545ed ("ice: Add support for adaptive interrupt moderation")
was meant to add support for adaptive interrupt moderation but there was
an error on my part while formatting the patch, and thus only part of the
patch ended up being submitted.

This patch rectifies the error by adding the rest of the code.

Fixes: 63f545ed ("ice: Add support for adaptive interrupt moderation")
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

64a59d05

26 3月, 2019 2 次提交

ice: map Rx buffer pages with DMA attributes · a65f71fe

由 Maciej Fijalkowski 提交于 2月 13, 2019

Provide DMA_ATTR_WEAK_ORDERING and DMA_ATTR_SKIP_CPU_SYNC attributes to
the DMA API during the mapping operations on Rx side. With this change
the non-x86 platforms will be able to sync only with what is being used
(2k buffer) instead of entire page. This should yield a slight
performance improvement.

Furthermore, DMA unmap may destroy the changes that were made to the
buffer by CPU when platform is not a x86 one. DMA_ATTR_SKIP_CPU_SYNC
attribute usage fixes this issue.

Also add a sync_single_for_device call during the Rx buffer assignment,
to make sure that the cache lines are cleared before device attempting
to write to the buffer.
Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

a65f71fe

ice: Introduce bulk update for page count · 03c66a13

由 Maciej Fijalkowski 提交于 2月 13, 2019

{get,put}_page are atomic operations which we use for page count
handling. The current logic for refcount handling is that we increment
it when passing a skb with the data from the first half of page up to
netstack and recycle the second half of page. This operation protects us
from losing a page since the network stack can decrement the refcount of
page from skb.

The performance can be gently improved by doing the bulk updates of
refcount instead of doing it one by one. During the buffer initialization,
maximize the page's refcount and don't allow the refcount to become
less than two.
Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

03c66a13

20 3月, 2019 1 次提交

ice: configure GLINT_ITR to always have an ITR gran of 2 · 70457520

由 Brett Creeley 提交于 2月 08, 2019

Instead of hoping that our ITR granularity will be 2 usec program the
GLINT_CTL register to make sure the ITR granularity is always 2 usecs.

Now that we know what the ITR granularity will be get rid of the check
in ice_probe() to verify our previous assumption.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

70457520

16 1月, 2019 2 次提交

ice: Implement getting and setting ethtool coalesce · 67fe64d7

由 Brett Creeley 提交于 12月 19, 2018

This patch includes the following ethtool operations:

1. get_coalesce
2. set_coalesce
3. get_per_q_coalesce
4. set_per_q_coalesce

Each ITR value (current_itr/target_itr) are stored on a per
ice_ring_container basis. This is because each valid ice_ring_container
can have 1 or more rings that are tied to the same q_vector ITR index.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

67fe64d7

ice: Add support for adaptive interrupt moderation · 63f545ed

由 Brett Creeley 提交于 12月 19, 2018

Currently the driver does not support adaptive/dynamic interrupt
moderation. This patch adds support for this. Also, adaptive/dynamic
interrupt moderation is turned on by default upon driver load.

In order to support adaptive interrupt moderation, two functions were
added, ice_update_itr() and ice_itr_divisor(). These are used to
determine the current packet load and to determine a divisor based
on link speed respectively.

This patch also adds the ICE_ITR_GRAN_S define that is used in the
hot-path when setting a new ITR value. The shift is used to pet two
birds with one hand, set the ITR value while re-enabling the
interrupt. Also, the ICE_ITR_GRAN_S is defined as 1 because the device
has a ITR granularity of 2usecs.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

63f545ed

07 11月, 2018 1 次提交

ice: Fix tx_timeout in PF driver · c585ea42

由 Brett Creeley 提交于 10月 26, 2018

Prior to this commit the driver was running into tx_timeouts when a
queue was stressed enough. This was happening because the HW tail
and SW tail (NTU) were incorrectly out of sync. Consequently this was
causing the HW head to collide with the HW tail, which to the hardware
means that all descriptors posted for Tx have been processed.

Due to the Tx logic used in the driver SW tail and HW tail are allowed
to be out of sync. This is done as an optimization because it allows the
driver to write HW tail as infrequently as possible, while still
updating the SW tail index to keep track. However, there are situations
where this results in the tail never getting updated, resulting in Tx
timeouts.

Tx HW tail write condition:
	if (netif_xmit_stopped(txring_txq(tx_ring) || !skb->xmit_more)
		writel(sw_tail, tx_ring->tail);

An issue was found in the Tx logic that was causing the afore mentioned
condition for updating HW tail to never happen, causing tx_timeouts.

In ice_xmit_frame_ring we calculate how many descriptors we need for the
Tx transaction based on the skb the kernel hands us. This is then passed
into ice_maybe_stop_tx along with some extra padding to determine if we
have enough descriptors available for this transaction. If we don't then
we return -EBUSY to the stack, otherwise we move on and eventually
prepare the Tx descriptors accordingly in ice_tx_map and set
next_to_watch. In ice_tx_map we make another call to ice_maybe_stop_tx
with a value of MAX_SKB_FRAGS + 4. The key here is that this value is
possibly less than the value we sent in the first call to
ice_maybe_stop_tx in ice_xmit_frame_ring. Now, if the number of unused
descriptors is between MAX_SKB_FRAGS + 4 and the value used in the first
call to ice_maybe_stop_tx in ice_xmit_frame_ring then we do not update
the HW tail because of the "Tx HW tail write condition" above. This is
because in ice_maybe_stop_tx we return success from ice_maybe_stop_tx
instead of calling __ice_maybe_stop_tx and subsequently calling
netif_stop_subqueue, which sets the __QUEUE_STATE_DEV_XOFF bit. This
bit is then checked in the "Tx HW tail write condition" by calling
netif_xmit_stopped and subsequently updating HW tail if the
afore mentioned bit is set.

In ice_clean_tx_irq, if next_to_watch is not NULL, we end up cleaning
the descriptors that HW sets the DD bit on and we have the budget. The
HW head will eventually run into the HW tail in response to the
description in the paragraph above.

The next time through ice_xmit_frame_ring we make the initial call to
ice_maybe_stop_tx with another skb from the stack. This time we do not
have enough descriptors available and we return NETDEV_TX_BUSY to the
stack and end up setting next_to_watch to NULL.

This is where we are stuck. In ice_clean_tx_irq we never clean anything
because next_to_watch is always NULL and in ice_xmit_frame_ring we never
update HW tail because we already return NETDEV_TX_BUSY to the stack and
eventually we hit a tx_timeout.

This issue was fixed by making sure that the second call to
ice_maybe_stop_tx in ice_tx_map is passed a value that is >= the value
that was used on the initial call to ice_maybe_stop_tx in
ice_xmit_frame_ring. This was done by adding the following defines to
make the logic more clear and to reduce the chance of mucking this up
again:

ICE_CACHE_LINE_BYTES		64
ICE_DESCS_PER_CACHE_LINE	(ICE_CACHE_LINE_BYTES / \
				 sizeof(struct ice_tx_desc))
ICE_DESCS_FOR_CTX_DESC		1
ICE_DESCS_FOR_SKB_DATA_PTR	1

The ICE_CACHE_LINE_BYTES being 64 is an assumption being made so we
don't have to figure this out on every pass through the Tx path. Instead
I added a sanity check in ice_probe to verify cache line size and print
a message if it's not 64 Bytes. This will make it easier to file issues
if they are seen when the cache line size is not 64 Bytes when reading
from the GLPCI_CNF2 register.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

c585ea42

03 10月, 2018 1 次提交

ice: Add more flexibility on how we assign an ITR index · d2b464a7

由 Brett Creeley 提交于 9月 19, 2018

This issue came about when looking at the VF function
ice_vc_cfg_irq_map_msg. Currently we are assigning the itr_setting value
to the itr_idx received from the AVF driver, which is not correct and is
not used for the VF flow anyway. Currently the only way we set the ITR
index for both the PF and VF driver is by hard coding ICE_TX_ITR or
ICE_RX_ITR for the ITR index on each q_vector.

To fix this, add the member itr_idx in struct ice_ring_container. This
can then be used to dynamically program the correct ITR index. This change
also affected the PF driver so make the necessary changes there as well.

Also, removed the itr_setting member in struct ice_ring because it is not
being used meaningfully and is going to be removed in a future patch that
includes dynamic ITR.

On another note, this will be useful moving forward if we decide to split
Rx/Tx rings on different q_vectors instead of sharing them as queue pairs.
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

d2b464a7

02 10月, 2018 1 次提交

ice: Add support for dynamic interrupt moderation · 9e4ab4c2

由 Brett Creeley 提交于 9月 19, 2018

Currently there is no support for dynamic interrupt moderation. This
patch adds some initial code to support this. The following changes
were made:

1. Currently we are using multiple members to store the interrupt
   granularity (itr_gran_25/50/100/200). This is not necessary because
   we can query the device to determine what the interrupt granularity
   should be set to, done by a new function ice_get_itr_intrl_gran.

2. Added intrl to ice_q_vector structure to support interrupt rate
   limiting.

3. Added the function ice_intrl_usecs_to_reg for converting to a value
   in usecs that the device understands.

4. Added call to write to the GLINT_RATE register. Disable intrl by
   default for now.

5. Changed rx/tx_itr_setting to itr_setting because having both seems
   redundant because a ring is either Tx or Rx.

6. Initialize itr_setting for both Tx/Rx rings in ice_vsi_alloc_rings()
Signed-off-by: NBrett Creeley <brett.creeley@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

9e4ab4c2

29 8月, 2018 1 次提交

ice: Add support for Tx hang, Tx timeout and malicious driver detection · b3969fd7

由 Sudheer Mogilappagari 提交于 8月 09, 2018

When a malicious operation is detected, the firmware triggers an
interrupt, which is then picked up by the service task (specifically by
ice_handle_mdd_event). A reset is scheduled if required.

Tx hang detection works in a similar way, except the logic here monitors
the VSI's Tx queues and tries to revive them if stalled. If the hang is
not resolved, the kernel eventually calls ndo_tx_timeout, which is
handled by ice_tx_timeout.
Signed-off-by: NSudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NTony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

b3969fd7

24 8月, 2018 1 次提交

ice: Change struct members from bool to u8 · 43f8b224

由 Bruce Allan 提交于 8月 09, 2018

Recent versions of checkpatch have a new warning based on a documented
preference of Linus to not use bool in structures due to wasted space and
the size of bool is implementation dependent. For more information, see
the email thread at https://lkml.org/lkml/2017/11/21/384.
Signed-off-by: NBruce Allan <bruce.w.allan@intel.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NTony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

43f8b224

27 3月, 2018 5 次提交

ice: Add support for VLANs and offloads · d76a60ba

由 Anirudh Venkataramanan 提交于 3月 20, 2018

This patch adds support for VLANs. When a VLAN is created a switch filter
is added to direct the VLAN traffic to the corresponding VSI. When a VLAN
is deleted, the filter is deleted as well.

This patch also adds support for the following hardware offloads.
    1) VLAN tag insertion/stripping
    2) Receive Side Scaling (RSS)
    3) Tx checksum and TCP segmentation
    4) Rx checksum
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NTony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

d76a60ba

ice: Implement transmit and NAPI support · 2b245cb2

由 Anirudh Venkataramanan 提交于 3月 20, 2018

This patch implements ice_start_xmit (the handler for ndo_start_xmit) and
related functions. ice_start_xmit ultimately calls ice_tx_map, where the
Tx descriptor is built and posted to the hardware by bumping the ring tail.

This patch also implements ice_napi_poll, which is invoked when there's an
interrupt on the VSI's queues. The interrupt can be due to either a
completed Tx or an Rx event. In case of a completed Tx/Rx event, resources
are reclaimed. Additionally, in case of an Rx event, the skb is fetched
and passed up to the network stack.
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NTony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

2b245cb2

ice: Configure VSIs for Tx/Rx · cdedef59

由 Anirudh Venkataramanan 提交于 3月 20, 2018

This patch configures the VSIs to be able to send and receive
packets by doing the following:

1) Initialize flexible parser to extract and include certain
   fields in the Rx descriptor.

2) Add Tx queues by programming the Tx queue context (implemented in
   ice_vsi_cfg_txqs). Note that adding the queues also enables (starts)
   the queues.

3) Add Rx queues by programming Rx queue context (implemented in
   ice_vsi_cfg_rxqs). Note that this only adds queues but doesn't start
   them. The rings will be started by calling ice_vsi_start_rx_rings on
   interface up.

4) Configure interrupts for VSI queues.

5) Implement ice_open and ice_stop.
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: NTony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

cdedef59

ice: Add support for VSI allocation and deallocation · 3a858ba3

由 Anirudh Venkataramanan 提交于 3月 20, 2018

This patch introduces data structures and functions to alloc/free
VSIs. The driver represents a VSI using the ice_vsi structure.

Some noteworthy points about VSI allocation:

1) A VSI is allocated in the firmware using the "add VSI" admin queue
command (implemented as ice_aq_add_vsi). The firmware returns an
identifier for the allocated VSI. The VSI context is used to program
certain aspects (loopback, queue map, etc.) of the VSI's configuration.

2) A VSI is deleted using the "free VSI" admin queue command (implemented
as ice_aq_free_vsi).

3) The driver represents a VSI using struct ice_vsi. This is allocated
and initialized as part of the ice_vsi_alloc flow, and deallocated
as part of the ice_vsi_delete flow.

4) Once the VSI is created, a netdev is allocated and associated with it.
The VSI's ring and vector related data structures are also allocated
and initialized.

5) A VSI's queues can either be contiguous or scattered. To do this, the
driver maintains a bitmap (vsi->avail_txqs) which is kept in sync with
the firmware's VSI queue allocation imap. If the VSI can't get a
contiguous queue allocation, it will fallback to scatter. This is
implemented in ice_vsi_get_qs which is called as part of the VSI setup
flow. In the release flow, the VSI's queues are released and the bitmap
is updated to reflect this by ice_vsi_put_qs.

CC: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: NShannon Nelson <shannon.nelson@oracle.com>
Tested-by: NTony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

3a858ba3

ice: Initialize PF and setup miscellaneous interrupt · 940b61af

由 Anirudh Venkataramanan 提交于 3月 20, 2018

This patch continues the initialization flow as follows:

1) Allocate and initialize necessary fields (like vsi, num_alloc_vsi,
   irq_tracker, etc) in the ice_pf instance.

2) Setup the miscellaneous interrupt handler. This also known as the
   "other interrupt causes" (OIC) handler and is used to handle non
   hotpath interrupts (like control queue events, link events,
   exceptions, etc.

3) Implement a background task to process admin queue receive (ARQ)
   events received by the driver.

CC: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: NShannon Nelson <shannon.nelson@oracle.com>
Tested-by: NTony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

940b61af

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功