Commit cae22979, author: Chengchang Tang, committer: ZhouJuan

RDMA/hns: Fix base address table allocation

maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I76PUJ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=7f3969b14f35

---------------------------------------------------------------

For hns, the specification of an entry-like resource (e.g. WQE/CQE/EQE)
depends on BT page size, buf page size and hopnum. For user mode, the buf
page size depends on UMEM. Therefore, the actual specification is
controlled by BT page size and hopnum.

The current BT page size and hopnum are obtained from firmware. This makes
the driver inflexible and introduces unnecessary constraints.  Resource
allocation failures occur in many scenarios.

This patch will calculate whether the BT page size set by firmware is
sufficient before allocating BT, and increase the BT page size if it is
insufficient.

Fixes: 11334014 ("RDMA/hns: Optimize base address table config flow for qp buffer")
Link: https://lore.kernel.org/r/20230512092245.344442-3-huangjunxian6@hisilicon.com
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Zhou Juan <nnuzj07170227@163.com>
Parent ca69b2a2
@@ -34,6 +34,7 @@
#include <linux/vmalloc.h>
#include <linux/count_zeros.h>
#include <rdma/ib_umem.h>
#include <linux/bitops.h>
#include "hns_roce_device.h"
#include "hns_roce_cmd.h"
#include "hns_roce_hem.h"
@@ -1070,6 +1071,44 @@ static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev,
	return 0;
}

static u64 cal_pages_per_l1ba(unsigned int ba_per_bt, unsigned int hopnum)
{
	return int_pow(ba_per_bt, hopnum - 1);
}

static unsigned int cal_best_bt_pg_sz(struct hns_roce_dev *hr_dev,
				      struct hns_roce_mtr *mtr,
				      unsigned int pg_shift)
{
	unsigned long cap = hr_dev->caps.page_size_cap;
	struct hns_roce_buf_region *re;
	unsigned int pgs_per_l1ba;
	unsigned int ba_per_bt;
	unsigned int ba_num;
	int i;

	for_each_set_bit_from(pg_shift, &cap, sizeof(cap) * BITS_PER_BYTE) {
		if (!(BIT(pg_shift) & cap))
			continue;

		ba_per_bt = BIT(pg_shift) / BA_BYTE_LEN;
		ba_num = 0;
		for (i = 0; i < mtr->hem_cfg.region_count; i++) {
			re = &mtr->hem_cfg.region[i];
			if (re->hopnum == 0)
				continue;

			pgs_per_l1ba = cal_pages_per_l1ba(ba_per_bt, re->hopnum);
			ba_num += DIV_ROUND_UP(re->count, pgs_per_l1ba);
		}

		if (ba_num <= ba_per_bt)
			return pg_shift;
	}

	return 0;
}

static int mtr_alloc_mtt(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr,
			 unsigned int ba_page_shift)
{
@@ -1078,6 +1117,10 @@ static int mtr_alloc_mtt(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr,
	hns_roce_hem_list_init(&mtr->hem_list);
	if (!cfg->is_direct) {
		ba_page_shift = cal_best_bt_pg_sz(hr_dev, mtr, ba_page_shift);
		if (!ba_page_shift)
			return -ERANGE;

		ret = hns_roce_hem_list_request(hr_dev, &mtr->hem_list,
						cfg->region, cfg->region_count,
						ba_page_shift);
......
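
The page-size search in cal_best_bt_pg_sz() can be tried outside the kernel with a small user-space sketch. This is not the driver code: `struct region`, `sat_pow()` and `best_bt_pg_shift()` are hypothetical names, the capability mask is passed in directly instead of being read from `hr_dev->caps.page_size_cap`, and `BA_BYTE_LEN` is assumed to be 8 bytes per base address. The sketch also saturates the power calculation and skips page sizes too small to hold one address, which the patch itself does not do.

```c
#include <stdint.h>

/* Assumption: one base address (BA) entry occupies 8 bytes. */
#define BA_BYTE_LEN 8

/* Hypothetical stand-in for the hopnum/count fields of a buffer region. */
struct region {
	unsigned int hopnum;	/* 0 means the region needs no BT entries */
	uint64_t count;		/* number of buffer pages in the region */
};

/* base^exp for base >= 1, saturating at UINT64_MAX instead of wrapping. */
static uint64_t sat_pow(uint64_t base, unsigned int exp)
{
	uint64_t r = 1;

	while (exp--) {
		if (r > UINT64_MAX / base)
			return UINT64_MAX;
		r *= base;
	}
	return r;
}

/*
 * Return the smallest supported page shift >= pg_shift whose single BT page
 * can hold the level-1 base addresses of all regions, or 0 if none fits.
 */
static unsigned int best_bt_pg_shift(uint64_t cap, const struct region *re,
				     int region_count, unsigned int pg_shift)
{
	for (; pg_shift < 64; pg_shift++) {
		uint64_t ba_per_bt, ba_num = 0;
		int i;

		if (!(cap & (UINT64_C(1) << pg_shift)))
			continue;	/* page size not supported */

		ba_per_bt = (UINT64_C(1) << pg_shift) / BA_BYTE_LEN;
		if (ba_per_bt == 0)
			continue;	/* page too small for one address */

		for (i = 0; i < region_count; i++) {
			uint64_t pgs_per_l1ba;

			if (re[i].hopnum == 0)
				continue;

			/* pages reachable through one level-1 address */
			pgs_per_l1ba = sat_pow(ba_per_bt, re[i].hopnum - 1);
			/* round up without overflowing */
			ba_num += re[i].count / pgs_per_l1ba +
				  (re[i].count % pgs_per_l1ba != 0);
		}

		if (ba_num <= ba_per_bt)
			return pg_shift;
	}

	return 0;
}
```

For example, with 4 KiB and 8 KiB pages supported and two one-hop regions of 300 pages each, a 4 KiB table holds only 512 addresses while 600 are needed, so the search moves on to the 8 KiB page (shift 13). When no supported size fits, the sketch returns 0, which corresponds to the case where mtr_alloc_mtt() returns -ERANGE.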