提交 ba82427d 编写于 作者: D David S. Miller

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-03-23

This series contains updates to i40e and i40e.txt documentation.

Jake provides all the changes in the series which are centered around
ntuple filter fixes and additional support.  Fixed the current
implementation of .set_rxnfc, where we were not reading the mask field
for filter entries which was resulting in filters not behaving as
expected and not working correctly.  When cleaning up after disabling
flow director support, ensure that the default input set is correctly
reprogrammed.  Since the hardware only supports a single input set for
all flows of that type, the driver shall only allow the input set to
change if there are no other configured filters for that flow type, so
add support to detect when we can update the input set for each flow
type.  Align the driver to other drivers to partition the ring_cookie
value into 8bits of VF index, along with 32bits of queue number instead
of using the user-def field.  Added support to parse the user-def field
into a data structure format to allow future extensions of the user-def
filed by keeping all the code that read/writes the field into a single
location.  Added support for flexible payloads passed via ethtool
user-def field.  We support a single flexible word (2byte) value per
protocol type, and we handle the FLX_PIT register using a list of
flexible entries so that each flow type may be configured separately.
Enabled flow director filters for SCTPv4 packets using the ethtool
ntuple interface to enable filters.  Updated the documentation on the
i40e driver to include the newly added support to ntuple filters.
Reduced complexity of a if-continue-else-break section of code by
taking advantage of using hlist_for_each_entry_continue() instead.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
......@@ -63,6 +63,78 @@ Additional Configurations
The latest release of ethtool can be found from
https://www.kernel.org/pub/software/network/ethtool
Flow Director n-ntuple traffic filters (FDir)
---------------------------------------------
The driver utilizes the ethtool interface for configuring ntuple filters,
via "ethtool -N <device> <filter>".
The sctp4, ip4, udp4, and tcp4 flow types are supported with the standard
fields including src-ip, dst-ip, src-port and dst-port. The driver only
supports fully enabling or fully masking the fields, so use of the mask
fields for partial matches is not supported.
Additionally, the driver supports using the action to specify filters for a
Virtual Function. You can specify the action as a 64bit value, where the
lower 32 bits represents the queue number, while the next 8 bits represent
which VF. Note that 0 is the PF, so the VF identifier is offset by 1. For
example:
... action 0x800000002 ...
Would indicate to direct traffic for Virtual Function 7 (8 minus 1) on queue
2 of that VF.
The driver also supports using the user-defined field to specify 2 bytes of
arbitrary data to match within the packet payload in addition to the regular
fields. The data is specified in the lower 32bits of the user-def field in
the following way:
+----------------------------+---------------------------+
| 31 28 24 20 16 | 15 12 8 4 0|
+----------------------------+---------------------------+
| offset into packet payload | 2 bytes of flexible data |
+----------------------------+---------------------------+
As an example,
... user-def 0x4FFFF ....
means to match the value 0xFFFF 4 bytes into the packet payload. Note that
the offset is based on the beginning of the payload, and not the beginning
of the packet. Thus
flow-type tcp4 ... user-def 0x8BEAF ....
would match TCP/IPv4 packets which have the value 0xBEAF 8bytes into the
TCP/IPv4 payload.
For ICMP, the hardware parses the ICMP header as 4 bytes of header and 4
bytes of payload, so if you want to match an ICMP frames payload you may need
to add 4 to the offset in order to match the data.
Furthermore, the offset can only be up to a value of 64, as the hardware
will only read up to 64 bytes of data from the payload. It must also be even
as the flexible data is 2 bytes long and must be aligned to byte 0 of the
packet payload.
When programming filters, the hardware is limited to using a single input
set for each flow type. This means that it is an error to program two
different filters with the same type that don't match on the same fields.
Thus the second of the following two commands will fail:
ethtool -N <device> flow-type tcp4 src-ip 192.168.0.7 action 5
ethtool -N <device> flow-type tcp4 dst-ip 192.168.15.18 action 1
This is because the first filter will be accepted and reprogram the input
set for TCPv4 filters, but the second filter will be unable to reprogram the
input set until all the conflicting TCPv4 filters are first removed.
Note that the user-defined flexible offset is also considered part of the
input set and cannot be programmed separately for multiple filters of the
same type. However, the flexible data is not part of the input set and
multiple filters may use the same offset but match against different data.
Data Center Bridging (DCB)
--------------------------
DCB configuration is not currently supported.
......
......@@ -202,6 +202,15 @@ enum i40e_fd_stat_idx {
#define I40E_FD_ATR_TUNNEL_STAT_IDX(pf_id) \
(I40E_FD_STAT_PF_IDX(pf_id) + I40E_FD_STAT_ATR_TUNNEL)
/* The following structure contains the data parsed from the user-defined
* field of the ethtool_rx_flow_spec structure.
*/
struct i40e_rx_flow_userdef {
bool flex_filter;
u16 flex_word;
u16 flex_offset;
};
struct i40e_fdir_filter {
struct hlist_node fdir_node;
/* filter ipnut set */
......@@ -213,6 +222,12 @@ struct i40e_fdir_filter {
__be16 src_port;
__be16 dst_port;
__be32 sctp_v_tag;
/* Flexible data to match within the packet payload */
__be16 flex_word;
u16 flex_offset;
bool flex_filter;
/* filter control */
u16 q_index;
u8 flex_off;
......@@ -249,6 +264,75 @@ struct i40e_udp_port_config {
u8 type;
};
/* macros related to FLX_PIT */
#define I40E_FLEX_SET_FSIZE(fsize) (((fsize) << \
I40E_PRTQF_FLX_PIT_FSIZE_SHIFT) & \
I40E_PRTQF_FLX_PIT_FSIZE_MASK)
#define I40E_FLEX_SET_DST_WORD(dst) (((dst) << \
I40E_PRTQF_FLX_PIT_DEST_OFF_SHIFT) & \
I40E_PRTQF_FLX_PIT_DEST_OFF_MASK)
#define I40E_FLEX_SET_SRC_WORD(src) (((src) << \
I40E_PRTQF_FLX_PIT_SOURCE_OFF_SHIFT) & \
I40E_PRTQF_FLX_PIT_SOURCE_OFF_MASK)
#define I40E_FLEX_PREP_VAL(dst, fsize, src) (I40E_FLEX_SET_DST_WORD(dst) | \
I40E_FLEX_SET_FSIZE(fsize) | \
I40E_FLEX_SET_SRC_WORD(src))
#define I40E_FLEX_PIT_GET_SRC(flex) (((flex) & \
I40E_PRTQF_FLX_PIT_SOURCE_OFF_MASK) >> \
I40E_PRTQF_FLX_PIT_SOURCE_OFF_SHIFT)
#define I40E_FLEX_PIT_GET_DST(flex) (((flex) & \
I40E_PRTQF_FLX_PIT_DEST_OFF_MASK) >> \
I40E_PRTQF_FLX_PIT_DEST_OFF_SHIFT)
#define I40E_FLEX_PIT_GET_FSIZE(flex) (((flex) & \
I40E_PRTQF_FLX_PIT_FSIZE_MASK) >> \
I40E_PRTQF_FLX_PIT_FSIZE_SHIFT)
#define I40E_MAX_FLEX_SRC_OFFSET 0x1F
/* macros related to GLQF_ORT */
#define I40E_ORT_SET_IDX(idx) (((idx) << \
I40E_GLQF_ORT_PIT_INDX_SHIFT) & \
I40E_GLQF_ORT_PIT_INDX_MASK)
#define I40E_ORT_SET_COUNT(count) (((count) << \
I40E_GLQF_ORT_FIELD_CNT_SHIFT) & \
I40E_GLQF_ORT_FIELD_CNT_MASK)
#define I40E_ORT_SET_PAYLOAD(payload) (((payload) << \
I40E_GLQF_ORT_FLX_PAYLOAD_SHIFT) & \
I40E_GLQF_ORT_FLX_PAYLOAD_MASK)
#define I40E_ORT_PREP_VAL(idx, count, payload) (I40E_ORT_SET_IDX(idx) | \
I40E_ORT_SET_COUNT(count) | \
I40E_ORT_SET_PAYLOAD(payload))
#define I40E_L3_GLQF_ORT_IDX 34
#define I40E_L4_GLQF_ORT_IDX 35
/* Flex PIT register index */
#define I40E_FLEX_PIT_IDX_START_L2 0
#define I40E_FLEX_PIT_IDX_START_L3 3
#define I40E_FLEX_PIT_IDX_START_L4 6
#define I40E_FLEX_PIT_TABLE_SIZE 3
#define I40E_FLEX_DEST_UNUSED 63
#define I40E_FLEX_INDEX_ENTRIES 8
/* Flex MASK to disable all flexible entries */
#define I40E_FLEX_INPUT_MASK (I40E_FLEX_50_MASK | I40E_FLEX_51_MASK | \
I40E_FLEX_52_MASK | I40E_FLEX_53_MASK | \
I40E_FLEX_54_MASK | I40E_FLEX_55_MASK | \
I40E_FLEX_56_MASK | I40E_FLEX_57_MASK)
struct i40e_flex_pit {
struct list_head list;
u16 src_offset;
u8 pit_index;
};
/* struct that defines the Ethernet device */
struct i40e_pf {
struct pci_dev *pdev;
......@@ -293,8 +377,17 @@ struct i40e_pf {
*/
u16 fd_tcp4_filter_cnt;
u16 fd_udp4_filter_cnt;
u16 fd_sctp4_filter_cnt;
u16 fd_ip4_filter_cnt;
/* Flexible filter table values that need to be programmed into
* hardware, which expects L3 and L4 to be programmed separately. We
* need to ensure that the values are in ascended order and don't have
* duplicates, so we track each L3 and L4 values in separate lists.
*/
struct list_head l3_flex_pit_list;
struct list_head l4_flex_pit_list;
struct i40e_udp_port_config udp_ports[I40E_MAX_PF_UDP_OFFLOAD_PORTS];
u16 pending_udp_bitmap;
......@@ -734,6 +827,43 @@ static inline int i40e_get_fd_cnt_all(struct i40e_pf *pf)
return pf->hw.fdir_shared_filter_count + pf->fdir_pf_filter_count;
}
/**
* i40e_read_fd_input_set - reads value of flow director input set register
* @pf: pointer to the PF struct
* @addr: register addr
*
* This function reads value of flow director input set register
* specified by 'addr' (which is specific to flow-type)
**/
static inline u64 i40e_read_fd_input_set(struct i40e_pf *pf, u16 addr)
{
u64 val;
val = i40e_read_rx_ctl(&pf->hw, I40E_PRTQF_FD_INSET(addr, 1));
val <<= 32;
val += i40e_read_rx_ctl(&pf->hw, I40E_PRTQF_FD_INSET(addr, 0));
return val;
}
/**
* i40e_write_fd_input_set - writes value into flow director input set register
* @pf: pointer to the PF struct
* @addr: register addr
* @val: value to be written
*
* This function writes specified value to the register specified by 'addr'.
* This register is input set register based on flow-type.
**/
static inline void i40e_write_fd_input_set(struct i40e_pf *pf,
u16 addr, u64 val)
{
i40e_write_rx_ctl(&pf->hw, I40E_PRTQF_FD_INSET(addr, 1),
(u32)(val >> 32));
i40e_write_rx_ctl(&pf->hw, I40E_PRTQF_FD_INSET(addr, 0),
(u32)(val & 0xFFFFFFFFULL));
}
/* needed by i40e_ethtool.c */
int i40e_up(struct i40e_vsi *vsi);
void i40e_down(struct i40e_vsi *vsi);
......
......@@ -1883,19 +1883,12 @@ static void i40e_undo_add_filter_entries(struct i40e_vsi *vsi,
static
struct i40e_new_mac_filter *i40e_next_filter(struct i40e_new_mac_filter *next)
{
while (next) {
next = hlist_entry(next->hlist.next,
typeof(struct i40e_new_mac_filter),
hlist);
/* keep going if we found a broadcast filter */
if (next && is_broadcast_ether_addr(next->f->macaddr))
continue;
break;
hlist_for_each_entry_continue(next, hlist) {
if (!is_broadcast_ether_addr(next->f->macaddr))
return next;
}
return next;
return NULL;
}
/**
......@@ -3286,6 +3279,7 @@ static void i40e_fdir_filter_restore(struct i40e_vsi *vsi)
/* Reset FDir counters as we're replaying all existing filters */
pf->fd_tcp4_filter_cnt = 0;
pf->fd_udp4_filter_cnt = 0;
pf->fd_sctp4_filter_cnt = 0;
pf->fd_ip4_filter_cnt = 0;
hlist_for_each_entry_safe(filter, node,
......@@ -5747,6 +5741,7 @@ int i40e_vsi_open(struct i40e_vsi *vsi)
static void i40e_fdir_filter_exit(struct i40e_pf *pf)
{
struct i40e_fdir_filter *filter;
struct i40e_flex_pit *pit_entry, *tmp;
struct hlist_node *node2;
hlist_for_each_entry_safe(filter, node2,
......@@ -5755,10 +5750,42 @@ static void i40e_fdir_filter_exit(struct i40e_pf *pf)
kfree(filter);
}
list_for_each_entry_safe(pit_entry, tmp, &pf->l3_flex_pit_list, list) {
list_del(&pit_entry->list);
kfree(pit_entry);
}
INIT_LIST_HEAD(&pf->l3_flex_pit_list);
list_for_each_entry_safe(pit_entry, tmp, &pf->l4_flex_pit_list, list) {
list_del(&pit_entry->list);
kfree(pit_entry);
}
INIT_LIST_HEAD(&pf->l4_flex_pit_list);
pf->fdir_pf_active_filters = 0;
pf->fd_tcp4_filter_cnt = 0;
pf->fd_udp4_filter_cnt = 0;
pf->fd_sctp4_filter_cnt = 0;
pf->fd_ip4_filter_cnt = 0;
/* Reprogram the default input set for TCP/IPv4 */
i40e_write_fd_input_set(pf, I40E_FILTER_PCTYPE_NONF_IPV4_TCP,
I40E_L3_SRC_MASK | I40E_L3_DST_MASK |
I40E_L4_SRC_MASK | I40E_L4_DST_MASK);
/* Reprogram the default input set for UDP/IPv4 */
i40e_write_fd_input_set(pf, I40E_FILTER_PCTYPE_NONF_IPV4_UDP,
I40E_L3_SRC_MASK | I40E_L3_DST_MASK |
I40E_L4_SRC_MASK | I40E_L4_DST_MASK);
/* Reprogram the default input set for SCTP/IPv4 */
i40e_write_fd_input_set(pf, I40E_FILTER_PCTYPE_NONF_IPV4_SCTP,
I40E_L3_SRC_MASK | I40E_L3_DST_MASK |
I40E_L4_SRC_MASK | I40E_L4_DST_MASK);
/* Reprogram the default input set for Other/IPv4 */
i40e_write_fd_input_set(pf, I40E_FILTER_PCTYPE_NONF_IPV4_OTHER,
I40E_L3_SRC_MASK | I40E_L3_DST_MASK);
}
/**
......@@ -11125,6 +11152,9 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
hw->bus.bus_id = pdev->bus->number;
pf->instance = pfs_found;
INIT_LIST_HEAD(&pf->l3_flex_pit_list);
INIT_LIST_HEAD(&pf->l4_flex_pit_list);
/* set up the locks for the AQ, do this only once in probe
* and destroy them only once in remove
*/
......
......@@ -71,6 +71,9 @@ static void i40e_fdir(struct i40e_ring *tx_ring,
flex_ptype |= I40E_TXD_FLTR_QW0_PCTYPE_MASK &
(fdata->pctype << I40E_TXD_FLTR_QW0_PCTYPE_SHIFT);
flex_ptype |= I40E_TXD_FLTR_QW0_PCTYPE_MASK &
(fdata->flex_offset << I40E_TXD_FLTR_QW0_FLEXOFF_SHIFT);
/* Use LAN VSI Id if not programmed by user */
flex_ptype |= I40E_TXD_FLTR_QW0_DEST_VSI_MASK &
((u32)(fdata->dest_vsi ? : pf->vsi[pf->lan_vsi]->id) <<
......@@ -223,6 +226,14 @@ static int i40e_add_del_fdir_udpv4(struct i40e_vsi *vsi,
ip->saddr = fd_data->src_ip;
udp->source = fd_data->src_port;
if (fd_data->flex_filter) {
u8 *payload = raw_packet + I40E_UDPIP_DUMMY_PACKET_LEN;
__be16 pattern = fd_data->flex_word;
u16 off = fd_data->flex_offset;
*((__force __be16 *)(payload + off)) = pattern;
}
fd_data->pctype = I40E_FILTER_PCTYPE_NONF_IPV4_UDP;
ret = i40e_program_fdir_filter(fd_data, raw_packet, pf, add);
if (ret) {
......@@ -289,6 +300,14 @@ static int i40e_add_del_fdir_tcpv4(struct i40e_vsi *vsi,
ip->saddr = fd_data->src_ip;
tcp->source = fd_data->src_port;
if (fd_data->flex_filter) {
u8 *payload = raw_packet + I40E_TCPIP_DUMMY_PACKET_LEN;
__be16 pattern = fd_data->flex_word;
u16 off = fd_data->flex_offset;
*((__force __be16 *)(payload + off)) = pattern;
}
fd_data->pctype = I40E_FILTER_PCTYPE_NONF_IPV4_TCP;
ret = i40e_program_fdir_filter(fd_data, raw_packet, pf, add);
if (ret) {
......@@ -327,6 +346,80 @@ static int i40e_add_del_fdir_tcpv4(struct i40e_vsi *vsi,
return 0;
}
#define I40E_SCTPIP_DUMMY_PACKET_LEN 46
/**
* i40e_add_del_fdir_sctpv4 - Add/Remove SCTPv4 Flow Director filters for
* a specific flow spec
* @vsi: pointer to the targeted VSI
* @fd_data: the flow director data required for the FDir descriptor
* @add: true adds a filter, false removes it
*
* Returns 0 if the filters were successfully added or removed
**/
static int i40e_add_del_fdir_sctpv4(struct i40e_vsi *vsi,
struct i40e_fdir_filter *fd_data,
bool add)
{
struct i40e_pf *pf = vsi->back;
struct sctphdr *sctp;
struct iphdr *ip;
u8 *raw_packet;
int ret;
/* Dummy packet */
static char packet[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x08, 0,
0x45, 0, 0, 0x20, 0, 0, 0x40, 0, 0x40, 0x84, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
raw_packet = kzalloc(I40E_FDIR_MAX_RAW_PACKET_SIZE, GFP_KERNEL);
if (!raw_packet)
return -ENOMEM;
memcpy(raw_packet, packet, I40E_SCTPIP_DUMMY_PACKET_LEN);
ip = (struct iphdr *)(raw_packet + IP_HEADER_OFFSET);
sctp = (struct sctphdr *)(raw_packet + IP_HEADER_OFFSET
+ sizeof(struct iphdr));
ip->daddr = fd_data->dst_ip;
sctp->dest = fd_data->dst_port;
ip->saddr = fd_data->src_ip;
sctp->source = fd_data->src_port;
if (fd_data->flex_filter) {
u8 *payload = raw_packet + I40E_SCTPIP_DUMMY_PACKET_LEN;
__be16 pattern = fd_data->flex_word;
u16 off = fd_data->flex_offset;
*((__force __be16 *)(payload + off)) = pattern;
}
fd_data->pctype = I40E_FILTER_PCTYPE_NONF_IPV4_SCTP;
ret = i40e_program_fdir_filter(fd_data, raw_packet, pf, add);
if (ret) {
dev_info(&pf->pdev->dev,
"PCTYPE:%d, Filter command send failed for fd_id:%d (ret = %d)\n",
fd_data->pctype, fd_data->fd_id, ret);
/* Free the packet buffer since it wasn't added to the ring */
kfree(raw_packet);
return -EOPNOTSUPP;
} else if (I40E_DEBUG_FD & pf->hw.debug_mask) {
if (add)
dev_info(&pf->pdev->dev,
"Filter OK for PCTYPE %d loc = %d\n",
fd_data->pctype, fd_data->fd_id);
else
dev_info(&pf->pdev->dev,
"Filter deleted for PCTYPE %d loc = %d\n",
fd_data->pctype, fd_data->fd_id);
}
if (add)
pf->fd_sctp4_filter_cnt++;
else
pf->fd_sctp4_filter_cnt--;
return 0;
}
#define I40E_IP_DUMMY_PACKET_LEN 34
/**
* i40e_add_del_fdir_ipv4 - Add/Remove IPv4 Flow Director filters for
......@@ -362,6 +455,14 @@ static int i40e_add_del_fdir_ipv4(struct i40e_vsi *vsi,
ip->daddr = fd_data->dst_ip;
ip->protocol = 0;
if (fd_data->flex_filter) {
u8 *payload = raw_packet + I40E_IP_DUMMY_PACKET_LEN;
__be16 pattern = fd_data->flex_word;
u16 off = fd_data->flex_offset;
*((__force __be16 *)(payload + off)) = pattern;
}
fd_data->pctype = i;
ret = i40e_program_fdir_filter(fd_data, raw_packet, pf, add);
if (ret) {
......@@ -413,6 +514,9 @@ int i40e_add_del_fdir(struct i40e_vsi *vsi,
case UDP_V4_FLOW:
ret = i40e_add_del_fdir_udpv4(vsi, input, add);
break;
case SCTP_V4_FLOW:
ret = i40e_add_del_fdir_sctpv4(vsi, input, add);
break;
case IP_USER_FLOW:
switch (input->ip4_proto) {
case IPPROTO_TCP:
......@@ -421,6 +525,9 @@ int i40e_add_del_fdir(struct i40e_vsi *vsi,
case IPPROTO_UDP:
ret = i40e_add_del_fdir_udpv4(vsi, input, add);
break;
case IPPROTO_SCTP:
ret = i40e_add_del_fdir_sctpv4(vsi, input, add);
break;
case IPPROTO_IP:
ret = i40e_add_del_fdir_ipv4(vsi, input, add);
break;
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册