iavf.rst 13.3 KB
Newer Older
1 2
.. SPDX-License-Identifier: GPL-2.0+

3 4 5
=================================================================
Linux Base Driver for Intel(R) Ethernet Adaptive Virtual Function
=================================================================
6

7
Intel Ethernet Adaptive Virtual Function Linux driver.
8
Copyright(c) 2013-2018 Intel Corporation.
9 10 11 12

Contents
========

13
- Overview
14
- Identifying Your Adapter
15
- Additional Configurations
16 17 18
- Known Issues/Troubleshooting
- Support

19 20 21
Overview
========

22
This file describes the iavf Linux Base Driver. This driver was formerly
23
called i40evf.
24

25 26 27 28
The iavf driver supports the below mentioned virtual function devices and
can only be activated on kernels running the i40e or newer Physical Function
(PF) driver compiled with CONFIG_PCI_IOV.  The iavf driver requires
CONFIG_PCI_MSI to be enabled.
29

30
The guest OS loading the iavf driver must support MSI-X interrupts.
31 32 33

Identifying Your Adapter
========================
34

35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57
The driver in this kernel is compatible with devices based on the following:
 * Intel(R) XL710 X710 Virtual Function
 * Intel(R) X722 Virtual Function
 * Intel(R) XXV710 Virtual Function
 * Intel(R) Ethernet Adaptive Virtual Function

For the best performance, make sure the latest NVM/FW is installed on your
device.

For information on how to identify your adapter, and for the latest NVM/FW
images and Intel network drivers, refer to the Intel Support website:
http://www.intel.com/support


Additional Features and Configurations
======================================

Viewing Link Messages
---------------------
Link messages will not be displayed to the console if the distribution is
restricting system messages. In order to see network driver link messages on
your console, set dmesg to eight by entering the following::

58
    # dmesg -n 8
59

60 61
NOTE:
  This setting is not saved across reboots.
62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80

ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The latest ethtool
version is required for this functionality. Download it at:
https://www.kernel.org/pub/software/network/ethtool/

Setting VLAN Tag Stripping
--------------------------
If you have applications that require Virtual Functions (VFs) to receive
packets with VLAN tags, you can disable VLAN tag stripping for the VF. The
Physical Function (PF) processes requests issued from the VF to enable or
disable VLAN tag stripping. Note that if the PF has assigned a VLAN to a VF,
then requests from that VF to set VLAN tag stripping will be ignored.

To enable/disable VLAN tag stripping for a VF, issue the following command
from inside the VM in which you are running the VF::

81
    # ethtool -K <if_name> rxvlan on/off
82 83 84

or alternatively::

85
    # ethtool --offload <if_name> rxvlan on/off
86 87 88 89 90 91 92 93 94 95 96 97 98 99

Adaptive Virtual Function
-------------------------
Adaptive Virtual Function (AVF) allows the virtual function driver, or VF, to
adapt to changing feature sets of the physical function driver (PF) with which
it is associated. This allows system administrators to update a PF without
having to update all the VFs associated with it. All AVFs have a single common
device ID and branding string.

AVFs have a minimum set of features known as "base mode," but may provide
additional features depending on what features are available in the PF with
which the AVF is associated. The following are base mode features:

- 4 Queue Pairs (QP) and associated Configuration Status Registers (CSRs)
100 101 102 103 104 105 106
  for Tx/Rx
- i40e descriptors and ring format
- Descriptor write-back completion
- 1 control queue, with i40e descriptors, CSRs and ring format
- 5 MSI-X interrupt vectors and corresponding i40e CSRs
- 1 Interrupt Throttle Rate (ITR) index
- 1 Virtual Station Interface (VSI) per VF
107 108
- 1 Traffic Class (TC), TC0
- Receive Side Scaling (RSS) with 64 entry indirection table and key,
109 110 111 112 113 114
  configured through the PF
- 1 unicast MAC address reserved per VF
- 16 MAC address filters for each VF
- Stateless offloads - non-tunneled checksums
- AVF device ID
- HW mailbox is used for VF to PF communications (including on Windows)
115 116 117 118 119 120 121 122 123 124 125

IEEE 802.1ad (QinQ) Support
---------------------------
The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN
IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as
"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks
allow L2 tunneling and the ability to segregate traffic within a particular
VLAN ID, among other uses.

The following are examples of how to configure 802.1ad (QinQ)::

126 127
    # ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24
    # ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371
128 129 130 131 132 133 134 135 136 137 138 139 140 141

Where "24" and "371" are example VLAN IDs.

NOTES:
  Receive checksum offloads, cloud filters, and VLAN acceleration are not
  supported for 802.1ad (QinQ) packets.

Application Device Queues (ADq)
-------------------------------
Application Device Queues (ADq) allows you to dedicate one or more queues to a
specific application. This can reduce latency for the specified application,
and allow Tx traffic to be rate limited per application. Follow the steps below
to set ADq.

142 143 144 145 146 147 148 149 150 151 152 153 154
Requirements:

- The sch_mqprio, act_mirred and cls_flower modules must be loaded
- The latest version of iproute2
- If another driver (for example, DPDK) has set cloud filters, you cannot
  enable ADQ
- Depending on the underlying PF device, ADQ cannot be enabled when the
  following features are enabled:

  + Data Center Bridging (DCB)
  + Multiple Functions per Port (MFP)
  + Sideband Filters

155 156 157 158 159
1. Create traffic classes (TCs). Maximum of 8 TCs can be created per interface.
The shaper bw_rlimit parameter is optional.

Example: Sets up two tcs, tc0 and tc1, with 16 queues each and max tx rate set
to 1Gbit for tc0 and 3Gbit for tc1.
160

161
::
162

163 164 165
    tc qdisc add dev <interface> root mqprio num_tc 2 map 0 0 0 0 1 1 1 1
    queues 16@0 16@16 hw 1 mode channel shaper bw_rlimit min_rate 1Gbit 2Gbit
    max_rate 1Gbit 3Gbit
166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183

map: priority mapping for up to 16 priorities to tcs (e.g. map 0 0 0 0 1 1 1 1
sets priorities 0-3 to use tc0 and 4-7 to use tc1)

queues: for each tc, <num queues>@<offset> (e.g. queues 16@0 16@16 assigns
16 queues to tc0 at offset 0 and 16 queues to tc1 at offset 16. Max total
number of queues for all tcs is 64 or number of cores, whichever is lower.)

hw 1 mode channel: ‘channel’ with ‘hw’ set to 1 is a new new hardware
offload mode in mqprio that makes full use of the mqprio options, the
TCs, the queue configurations, and the QoS parameters.

shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates.
Totals must be equal or less than port speed.

For example: min_rate 1Gbit 3Gbit: Verify bandwidth limit using network
monitoring tools such as ifstat or sar –n DEV [interval] [number of samples]

184 185 186 187
NOTE:
  Setting up channels via ethtool (ethtool -L) is not supported when the
  TCs are configured using mqprio.

188 189 190 191 192 193 194 195 196
2. Enable HW TC offload on interface::

    # ethtool -K <interface> hw-tc-offload on

3. Apply TCs to ingress (RX) flow of interface::

    # tc qdisc add dev <interface> ingress

NOTES:
197 198
 - Run all tc commands from the iproute2 <pathtoiproute2>/tc/ directory
 - ADq is not compatible with cloud filters
199
 - Setting up channels via ethtool (ethtool -L) is not supported when the TCs
200
   are configured using mqprio
201
 - You must have iproute2 latest version
202
 - NVM version 6.01 or later is required
203
 - ADq cannot be enabled when any the following features are enabled: Data
204
   Center Bridging (DCB), Multiple Functions per Port (MFP), or Sideband Filters
205
 - If another driver (for example, DPDK) has set cloud filters, you cannot
206
   enable ADq
207 208 209 210 211 212 213 214 215 216 217 218
 - Tunnel filters are not supported in ADq. If encapsulated packets do arrive
   in non-tunnel mode, filtering will be done on the inner headers.  For example,
   for VXLAN traffic in non-tunnel mode, PCTYPE is identified as a VXLAN
   encapsulated packet, outer headers are ignored. Therefore, inner headers are
   matched.
 - If a TC filter on a PF matches traffic over a VF (on the PF), that traffic
   will be routed to the appropriate queue of the PF, and will not be passed on
   the VF. Such traffic will end up getting dropped higher up in the TCP/IP
   stack as it does not match PF address data.
 - If traffic matches multiple TC filters that point to different TCs, that
   traffic will be duplicated and sent to all matching TC queues.  The hardware
   switch mirrors the packet to a VSI list when multiple filters are matched.
219

220 221 222 223

Known Issues/Troubleshooting
============================

224 225 226 227 228 229 230 231 232 233
Bonding fails with VFs bound to an Intel(R) Ethernet Controller 700 series device
---------------------------------------------------------------------------------
If you bind Virtual Functions (VFs) to an Intel(R) Ethernet Controller 700
series based device, the VF slaves may fail when they become the active slave.
If the MAC address of the VF is set by the PF (Physical Function) of the
device, when you add a slave, or change the active-backup slave, Linux bonding
tries to sync the backup slave's MAC address to the same MAC address as the
active slave. Linux bonding will fail at this point. This issue will not occur
if the VF's MAC address is not set by the PF.

234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250
Traffic Is Not Being Passed Between VM and Client
-------------------------------------------------
You may not be able to pass traffic between a client system and a
Virtual Machine (VM) running on a separate host if the Virtual Function
(VF, or Virtual NIC) is not in trusted mode and spoof checking is enabled
on the VF. Note that this situation can occur in any combination of client,
host, and guest operating system. For information on how to set the VF to
trusted mode, refer to the section "VLAN Tag Packet Steering" in this
readme document. For information on setting spoof checking, refer to the
section "MAC and VLAN anti-spoofing feature" in this readme document.

Do not unload port driver if VF with active VM is bound to it
-------------------------------------------------------------
Do not unload a port's driver if a Virtual Function (VF) with an active Virtual
Machine (VM) is bound to it. Doing so will cause the port to appear to hang.
Once the VM shuts down, or otherwise releases the VF, the command will complete.

251 252 253 254 255 256 257 258 259 260 261 262 263 264 265
Using four traffic classes fails
--------------------------------
Do not try to reserve more than three traffic classes in the iavf driver. Doing
so will fail to set any traffic classes and will cause the driver to write
errors to stdout. Use a maximum of three queues to avoid this issue.

Multiple log error messages on iavf driver removal
--------------------------------------------------
If you have several VFs and you remove the iavf driver, several instances of
the following log errors are written to the log::

    Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err ok
    Unable to send the message to VF 2 aq_err 12
    ARQ Overflow Error detected

266 267 268 269 270 271
Virtual machine does not get link
---------------------------------
If the virtual machine has more than one virtual port assigned to it, and those
virtual ports are bound to different physical ports, you may not get link on
all of the virtual ports. The following command may work around the issue::

272
    # ethtool -r <PF>
273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301

Where <PF> is the PF interface in the host, for example: p5p1. You may need to
run the command more than once to get link on all virtual ports.

MAC address of Virtual Function changes unexpectedly
----------------------------------------------------
If a Virtual Function's MAC address is not assigned in the host, then the VF
(virtual function) driver will use a random MAC address. This random MAC
address may change each time the VF driver is reloaded. You can assign a static
MAC address in the host machine. This static MAC address will survive
a VF driver reload.

Driver Buffer Overflow Fix
--------------------------
The fix to resolve CVE-2016-8105, referenced in Intel SA-00069
https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00069.html
is included in this and future versions of the driver.

Multiple Interfaces on Same Ethernet Broadcast Network
------------------------------------------------------
Due to the default ARP behavior on Linux, it is not possible to have one system
on two IP networks in the same Ethernet broadcast domain (non-partitioned
switch) behave as expected. All Ethernet interfaces will respond to IP traffic
for any IP address assigned to the system. This results in unbalanced receive
traffic.

If you have multiple interfaces in a server, either turn on ARP filtering by
entering::

302
    # echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
303

304 305 306
NOTE:
  This setting is not saved across reboots. The configuration change can be
  made permanent by adding the following line to the file /etc/sysctl.conf::
307

308
    net.ipv4.conf.all.arp_filter = 1
309 310 311 312 313 314 315 316 317

Another alternative is to install the interfaces in separate broadcast domains
(either in different switches or in a switch partitioned to VLANs).

Rx Page Allocation Errors
-------------------------
'Page allocation failure. order:0' errors may occur under stress.
This is caused by the way the Linux kernel reports this stressed condition.

318 319 320 321 322

Support
=======
For general information, go to the Intel support website at:

323
https://support.intel.com
324 325 326

or the Intel Wired Networking project hosted by Sourceforge at:

327
https://sourceforge.net/projects/e1000
328

329 330 331
If an issue is identified with the released source code on the supported kernel
with a supported adapter, email the specific information related to the issue
to e1000-devel@lists.sf.net