Commits · 3a6863e4e8ee212c7f86594299d9ff0d6a15ecbc · openeuler / Kernel

22 Nov, 2020 3 commits

net: hns3: add support for pf querying new interrupt resources · 3a6863e4

Committed by Yufeng Mo on Nov 20, 2020

For HNAE3_DEVICE_VERSION_V3, a maximum of 1281 interrupt
resources are supported. To utilize these new resources,
extend the corresponding fields and variables to a 16-bit type,
and remove the restriction that the NIC client uses at most
65 interrupt vectors. In addition, the I/O address of the
extended interrupt resources is different, so an extra
handler is needed.

Currently, the total number of interrupts is the sum of RoCE's
number and RoCE's offset (RoCE is in front of NIC), since
the numbers of NIC and RoCE interrupts are the same. For readability,
rewrite the corresponding field of the command: rename the
RoCE offset field to the number of NIC interrupts, so that
the total number of interrupts is the sum of the RoCE and NIC
numbers. Also replace vport->back with hdev in
hclge_init_roce_base_info() to simplify the code.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

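The widening to 16 bits and the renamed count fields can be illustrated with a small C sketch. The struct and function names below are hypothetical and are not taken from the hns3 driver; the sketch only shows that the total is the sum of the two counts and that a value such as 1281 does not survive in an 8-bit field.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the two interrupt counts; this is
 * not the driver's actual command layout. */
struct msi_counts {
	uint16_t num_nic_msi;   /* number of NIC interrupts */
	uint16_t num_roce_msi;  /* number of RoCE interrupts */
};

/* Total number of interrupts: the sum of the RoCE and NIC counts. */
static uint32_t total_msi(const struct msi_counts *c)
{
	return (uint32_t)c->num_nic_msi + (uint32_t)c->num_roce_msi;
}

/* What an 8-bit field would keep of a count: only the low 8 bits. */
static uint8_t store_in_u8(uint32_t count)
{
	return (uint8_t)(count & 0xffu);
}
```

With an arbitrary example split of 1000 NIC and 281 RoCE vectors, total_msi() yields 1281, while store_in_u8(1281) keeps only 1, which is why an 8-bit field cannot describe the V3 limit.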

net: hns3: add support for mapping device memory · 30ae7f8a

Committed by Huazhong Tan on Nov 20, 2020

For devices whose device memory is accessed through PCI BAR4,
the IO descriptor push of NIC and the direct WQE (Work Queue Element)
of RoCE will use this device memory. Add support for mapping
this device memory, and pass this information to the RoCE client,
whose new feature needs it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


net: hns3: add support for 1280 queues · 9a5ef4aa

Committed by Yonglong Liu on Nov 20, 2020

For DEVICE_VERSION_V1/2, there are a total of 1024 queues and
queue sets. For DEVICE_VERSION_V3, this increases to 1280,
and they can be assigned to one PF, so remove the limitation
of 1024.

To stay compatible with DEVICE_VERSION_V1/2 and old driver
versions, the queue number is split into two parts:
tqp_num (range 0~1023) and ext_tqp_num (range 1024~1279).

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

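One plausible reading of the tqp_num / ext_tqp_num split is that each field carries a count: up to 1024 queues in the first, and whatever exceeds 1024 in the second. The C sketch below follows that reading. The struct, macros, and helper names are hypothetical, and the clamping to 1280 is a choice made for the sketch, not something the commit message states.

```c
#include <stdint.h>

#define BASE_TQP_MAX  1024u  /* queues covered by tqp_num (0~1023) */
#define TOTAL_TQP_MAX 1280u  /* DEVICE_VERSION_V3 limit */

/* Hypothetical pair of fields, treated here as counts. */
struct tqp_fields {
	uint16_t tqp_num;      /* first part, at most 1024 */
	uint16_t ext_tqp_num;  /* extended part, at most 256 */
};

/* Split a total queue count into the base and extended parts. */
static struct tqp_fields split_tqp_count(uint16_t total)
{
	struct tqp_fields f;

	if (total > TOTAL_TQP_MAX)
		total = TOTAL_TQP_MAX;  /* clamp: sketch-only policy */

	f.tqp_num = (uint16_t)(total < BASE_TQP_MAX ? total : BASE_TQP_MAX);
	f.ext_tqp_num = (uint16_t)(total - f.tqp_num);
	return f;
}

/* Recombine the two parts into the total queue count. */
static uint16_t total_tqp_count(struct tqp_fields f)
{
	return (uint16_t)(f.tqp_num + f.ext_tqp_num);
}
```

Under this reading, an old driver that only looks at tqp_num never sees more than 1024 queues, while a new driver adds ext_tqp_num to reach 1280.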

21 Nov, 2020 37 commits

Merge branch 'ibmvnic-performance-improvements-and-other-updates' · 16de5970

Committed by Jakub Kicinski on Nov 20, 2020

Thomas Falcon says:

====================
ibmvnic: Performance improvements and other updates

The first three patches utilize a hypervisor call allowing multiple
TX and RX buffer replenishment descriptors to be sent in one operation,
which significantly reduces hypervisor call overhead. The xmit_more
and Byte Queue Limit APIs are leveraged to provide this support
for TX descriptors.

The subsequent two patches remove superfluous code and members in
TX completion handling function and TX buffer structure, respectively,
and remove unused routines.

Finally, four patches provided by Dwip Banerjee ensure that device
queue memory is cache-line aligned, resolving slowdowns observed in
PCI traces, and optimize the driver's NAPI polling function and
RX buffer replenishment.

This series provides significant performance improvements, allowing
the driver to fully utilize 100Gb NICs.
====================

Link: https://lore.kernel.org/r/1605748345-32062-1-git-send-email-tlfalcon@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


ibmvnic: Do not replenish RX buffers after every polling loop · 41ed0a00

Committed by Dwip N. Banerjee on Nov 18, 2020

Reduce the amount of time spent replenishing RX buffers by only doing
so once the number of available buffers has fallen below a certain
threshold, in this case half of the total number of buffers, or if the
polling loop exits having processed fewer packets than its budget.
Non-exhaustion of the NAPI budget implies lower incoming packet pressure,
allowing the leeway to refill the buffers in preparation for any
impending burst.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

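The replenishment rule described in this commit can be written as a single predicate. This is a minimal sketch with hypothetical parameter names, not the ibmvnic code itself: replenish when fewer than half of the buffers are available, or when the poll finished without using up its budget.

```c
#include <stdbool.h>

/* Decide whether RX buffers should be replenished after a poll.
 * All names are illustrative; the driver tracks these values in its
 * own structures. */
static bool should_replenish(unsigned int available, unsigned int total,
			     int processed, int budget)
{
	/* Threshold from the commit message: half of the total buffers. */
	bool below_threshold = available < total / 2;

	/* Budget not exhausted: lower packet pressure, time to refill. */
	bool budget_left = processed < budget;

	return below_threshold || budget_left;
}
```

For example, with 512 buffers in total, a poll that exhausts its budget while 400 buffers are still available skips replenishment, whereas the same poll with only 100 buffers available triggers it.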

ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers · e552aa31

Committed by Dwip N. Banerjee on Nov 18, 2020

Take advantage of the additional optimizations in netdev_alloc_skb when
allocating socket buffers to be used for packet reception.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Acked-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


ibmvnic: Correctly re-enable interrupts in NAPI polling routine · ec20f36b

Committed by Dwip N. Banerjee on Nov 18, 2020

If the current NAPI polling loop exits without completing its
budget, only re-enable interrupts if there are no entries remaining
in the queue and napi_complete_done is successful. If there are entries
remaining on the queue that were missed, restart the polling loop.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

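The rule in this commit message can be modelled as a pure decision function. The sketch below is not the driver's polling routine: the enum, the function, and its boolean inputs are hypothetical stand-ins for "entries remain in the queue" and "napi_complete_done returned success", and a real driver would evaluate those conditions in sequence rather than receive them as arguments.

```c
#include <stdbool.h>

/* Possible outcomes at the end of one polling pass (illustrative). */
enum next_step {
	NEXT_STAY_SCHEDULED,  /* budget used up; NAPI will poll again */
	NEXT_RESTART_POLL,    /* missed entries remain; poll again now */
	NEXT_ENABLE_IRQ,      /* queue empty and completion succeeded */
	NEXT_LEAVE_IRQ_OFF    /* completion did not succeed */
};

static enum next_step after_poll(int processed, int budget,
				 bool queue_pending, bool complete_done_ok)
{
	if (processed >= budget)
		return NEXT_STAY_SCHEDULED;

	/* Entries that were missed take priority over re-enabling. */
	if (queue_pending)
		return NEXT_RESTART_POLL;

	/* Only re-enable interrupts when completion was successful. */
	if (complete_done_ok)
		return NEXT_ENABLE_IRQ;

	return NEXT_LEAVE_IRQ_OFF;
}
```

The point of the ordering is that interrupts are never re-enabled while unprocessed entries are still sitting in the queue.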

ibmvnic: Ensure that device queue memory is cache-line aligned · 9a87c3fc

Committed by Dwip N. Banerjee on Nov 18, 2020

PCI bus slowdowns were observed on IBM VNIC devices as a result
of partial cache line writes and non-cache-aligned full cache line writes.
Ensure that packet data buffers are cache-line aligned to avoid these
slowdowns.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

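Cache-line alignment of a buffer comes down to two checks: its length is a multiple of the line size, and its start address is too. The helpers below are a generic sketch with hypothetical names; the line size is passed in as a parameter because the commit message does not state it (128 bytes is used in the usage note purely as an assumed example).

```c
#include <stddef.h>
#include <stdint.h>

/* Round a length up to the next multiple of the cache line size. */
static size_t round_up_to_line(size_t len, size_t line)
{
	return (len + line - 1) / line * line;
}

/* Check whether an address starts on a cache line boundary. */
static int is_line_aligned(const void *addr, size_t line)
{
	return ((uintptr_t)addr % line) == 0;
}
```

Assuming a 128-byte line, a 129-byte buffer would be padded to 256 bytes, so a full-buffer write never ends in a partial cache line.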

ibmvnic: Remove send_subcrq function · 8ed589f3