net/mlx5e: Optimize poll ICOSQ completion queue
UMR operations are more frequent and important. Check them first, and add a compiler branch predictor hint. According to current design, ICOSQ CQ can contain at most one pending CQE per napi. Poll function is optimized accordingly. Performance: Single-stream packet-rate tested with pktgen. Packets are dropped in tc level to zoom into driver data-path. Larger gain is expected for larger packet sizes, as BW is higher and UMR posts are more frequent. --------------------------------------------- packet size | before | after | gain | 64B | 4,092,370 | 4,113,306 | 0.5% | 1024B | 3,421,435 | 3,633,819 | 6.2% | Signed-off-by: NTariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Showing
想要评论请 注册 或 登录