- 09 Apr, 2019 1 commit
-
-
Committed by Leon Romanovsky
Simplify drivers by ensuring the lifetime of the ib_ah object. The changes in .create_ah() go hand in hand with the relevant update in .destroy_ah(). We will use this opportunity to convert .destroy_ah() so that it cannot fail, as was suggested a long time ago, because there is nothing to do in case of failure during destroy. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 02 Apr, 2019 4 commits
-
-
Committed by Leon Romanovsky
port_pd is treated as le32 in its declaration and when read; fix the assignment to be le32 too. This change fixes the following compilation warnings: drivers/infiniband/hw/hns/hns_roce_ah.c:67:24: warning: incorrect type in assignment (different base types) drivers/infiniband/hw/hns/hns_roce_ah.c:67:24: expected restricted __le32 [usertype] port_pd drivers/infiniband/hw/hns/hns_roce_ah.c:67:24: got restricted __be32 [usertype] Fixes: 9a443537 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Lijun Ou <ouliun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
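To illustrate why the byte order of such a field matters, here is a minimal userspace sketch, not the driver's code. In the kernel, the `__le32` type together with `cpu_to_le32()` and `le32_to_cpu()` plays this role; the `demo_` names below are hypothetical stand-ins.

```c
#include <stdint.h>

/* Hypothetical stand-in for a little-endian 32-bit field.
 * b[0] always holds the least significant byte, whatever the host CPU. */
typedef struct { uint8_t b[4]; } demo_le32;

/* Convert a host-order value into little-endian storage. */
static demo_le32 demo_cpu_to_le32(uint32_t v)
{
    demo_le32 out;

    out.b[0] = (uint8_t)(v & 0xff);
    out.b[1] = (uint8_t)((v >> 8) & 0xff);
    out.b[2] = (uint8_t)((v >> 16) & 0xff);
    out.b[3] = (uint8_t)((v >> 24) & 0xff);
    return out;
}

/* Read little-endian storage back into a host-order value. */
static uint32_t demo_le32_to_cpu(demo_le32 v)
{
    return (uint32_t)v.b[0] |
           ((uint32_t)v.b[1] << 8) |
           ((uint32_t)v.b[2] << 16) |
           ((uint32_t)v.b[3] << 24);
}
```

A field that is read with the little-endian helper must also be written with it; mixing in a big-endian conversion on assignment would swap the bytes on every host, which is the kind of mismatch the warning above flags.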
-
Committed by Shamir Rabinovitch
Now that ib_udata is passed to all the driver's object create/destroy APIs, the ib_udata will carry the ib_ucontext for every user command. There is no need to also pass the ib_ucontext via the function prototypes. Make ib_udata the only argument passed. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Shamir Rabinovitch
Now that we have the udata passed to all the ib_xxx object destroy APIs and the additional macro 'rdma_udata_to_drv_context' to get the ib_ucontext from the ib_udata stored in uverbs_attr_bundle, we can finally start to remove the drivers' dependency on ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Shamir Rabinovitch
The uverbs_attr_bundle with the ucontext is sent down to the drivers' ib_x destroy path as ib_udata. The next patch will use the ib_udata to free the drivers' destroy path from the dependency on 'uobject->context', as we already did for the create path. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 30 Mar, 2019 2 commits
-
-
Committed by Matthew Wilcox
Also fully initialise the qp before storing it in the XArray. Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Matthew Wilcox
Change the locking to not disable interrupts, as the lookup in interrupt context will not see a freed CQ thanks to the synchronize_irq() call before freeing the CQ. Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 28 Mar, 2019 1 commit
-
-
Committed by Leon Romanovsky
The forgotten static keyword causes the following warning to appear while building the HNS driver. Declare hns_roce_cmq_send() as a static function to fix this warning. drivers/infiniband/hw/hns/hns_roce_hw_v2.c:1089:5: warning: no previous prototype for 'hns_roce_cmq_send' [-Wmissing-prototypes] int hns_roce_cmq_send(struct hns_roce_dev *hr_dev, Fixes: 6a04aed6 ("RDMA/hns: Fix the chip hanging caused by sending mailbox&CMQ during reset") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 26 Mar, 2019 7 commits
-
-
Committed by Lijun Ou
The src_mac array is not used in the hns_roce_v2_modify_qp function. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Lijun Ou
According to the IB protocol, the send with invalidate operation will not invalidate an mr that was created through register mr or reregister mr. Fixes: e93df010 ("RDMA/hns: Support local invalidate for hip08 in kernel space") Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Lijun Ou
The driver should not print the error information when the hip08 driver does not support virtual functions. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Lijun Ou
According to the IB protocol, some fields of the qp context are optional and are filled only when the related attr_mask bits are set. The related attr_mask bits include IB_QP_TIMEOUT, IB_QP_RETRY_CNT, IB_QP_RNR_RETRY and IB_QP_MIN_RNR_TIMER. Besides, we move some assignments of the qp context fields outside of the specific qp state jump function. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Lijun Ou
According to the hip08 UM (User Manual), the raq_psn field size is [23:0]. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
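As a rough illustration of what a [23:0] field width implies, the following sketch masks a value down to its low 24 bits before it would be written to such a field. The macro and function names are hypothetical and are not the driver's own definitions.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not the driver's macros. */
#define DEMO_PSN_FIELD_BITS 24u
#define DEMO_PSN_FIELD_MASK ((UINT32_C(1) << DEMO_PSN_FIELD_BITS) - 1u) /* bits [23:0] */

/* Keep only the low 24 bits of a PSN so it fits a [23:0] field. */
static inline uint32_t demo_psn_to_field(uint32_t psn)
{
    return psn & DEMO_PSN_FIELD_MASK;
}
```

Using a mask that is too narrow would silently drop valid PSN bits, while one that is too wide would spill into the neighbouring field.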
-
Committed by Lijun Ou
Only when the IB_QP_RQ_PSN flag of attr_mask is set is it valid to assign the related rq psn fields into the qp context when modifying the qp. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Lijun Ou
Only when the IB_QP_SQ_PSN flag of attr_mask is set is it valid to assign the related psn fields into the qp context when modifying the qp. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
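The pattern described in this commit and the preceding RQ PSN one can be sketched as follows. This is a simplified userspace model under stated assumptions: the `DEMO_` flag values and the struct layouts are hypothetical, and the real IB_QP_* constants and qp context come from the kernel's RDMA headers and the driver.

```c
#include <stdint.h>

/* Hypothetical flag values; the real IB_QP_* constants live in the kernel. */
#define DEMO_QP_RQ_PSN (1u << 0)
#define DEMO_QP_SQ_PSN (1u << 1)

/* Hypothetical, heavily reduced attribute and context layouts. */
struct demo_qp_attr    { uint32_t rq_psn; uint32_t sq_psn; };
struct demo_qp_context { uint32_t rq_psn; uint32_t sq_psn; };

/* Copy a PSN into the context only if the caller flagged it as valid. */
static void demo_fill_psn(struct demo_qp_context *ctx,
                          const struct demo_qp_attr *attr,
                          uint32_t attr_mask)
{
    if (attr_mask & DEMO_QP_RQ_PSN)
        ctx->rq_psn = attr->rq_psn;

    if (attr_mask & DEMO_QP_SQ_PSN)
        ctx->sq_psn = attr->sq_psn;
}
```

Fields whose flag is not set keep their previous value, so stale or uninitialised attribute members never reach the context.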
-
- 05 Mar, 2019 1 commit
-
-
Committed by YueHaibing
In the below commit, hns_roce_v2_modify_qp is called inside a spinlock while using GFP_KERNEL. Change it to GFP_ATOMIC. Fixes: 0425e3e6 ("RDMA/hns: Support flush cqe for hip08 in kernel space") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 23 Feb, 2019 1 commit
-
-
Committed by Leon Romanovsky
Following the PD conversion patch, do the same for ucontext allocations. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 20 Feb, 2019 2 commits
-
-
Committed by Yangyang Li
The method of setting hem for the scc context is different from other contexts. It should notify the hardware with the detailed idx in bt0 for scc, while for other contexts it only needs to notify the bt step and the hardware will calculate the idx. This fixes the following error when unloading the hip08 driver:
[ 123.570768] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[ 123.579023] {1}[Hardware Error]: event severity: recoverable
[ 123.584670] {1}[Hardware Error]: Error 0, type: recoverable
[ 123.590317] {1}[Hardware Error]: section_type: PCIe error
[ 123.595877] {1}[Hardware Error]: version: 4.0
[ 123.600395] {1}[Hardware Error]: command: 0x0006, status: 0x0010
[ 123.606562] {1}[Hardware Error]: device_id: 0000:7d:00.0
[ 123.612034] {1}[Hardware Error]: slot: 0
[ 123.616120] {1}[Hardware Error]: secondary_bus: 0x00
[ 123.621245] {1}[Hardware Error]: vendor_id: 0x19e5, device_id: 0xa222
[ 123.627847] {1}[Hardware Error]: class_code: 000002
[ 123.632977] hns3 0000:7d:00.0: aer_status: 0x00000000, aer_mask: 0x00000000
[ 123.639928] hns3 0000:7d:00.0: aer_layer=Transaction Layer, aer_agent=Receiver ID
[ 123.647400] hns3 0000:7d:00.0: aer_uncor_severity: 0x00000000
[ 123.653136] hns3 0000:7d:00.0: PCI error detected, state(=1)!!
[ 123.658959] hns3 0000:7d:00.0: ROCEE uncorrected RAS error identified
[ 123.665395] hns3 0000:7d:00.0: ROCEE RAS AXI rresp error
[ 123.670713] hns3 0000:7d:00.0: requesting reset due to PCI error
[ 123.676715] hns3 0000:7d:00.0: received reset event , reset type is 5
[ 123.683147] hns3 0000:7d:00.0: AER: Device recovery successful
[ 123.688978] hns3 0000:7d:00.0: PF Reset requested
[ 123.693684] hns3 0000:7d:00.0: PF failed(=-5) to send mailbox message to VF
[ 123.700633] hns3 0000:7d:00.0: inform reset to vf(1) failded -5!
Fixes: 6a157f7d ("RDMA/hns: Add SCC context allocation support for hip08") Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Reviewed-by: Yixian Liu <liuyixian@huawei.com> Reviewed-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Lijun Ou
According to hip08's limitation, the qp and cq specification is 1M, and the mtpt specification is 1M in kernel space. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 16 Feb, 2019 1 commit
-
-
Committed by Shamir Rabinovitch
Now that we have the udata passed to all the ib_xxx object creation APIs and the additional macro 'rdma_udata_to_drv_context' to get the ib_ucontext from the ib_udata stored in uverbs_attr_bundle, we can finally start to remove the drivers' dependency on ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 15 Feb, 2019 5 commits
-
-
Committed by Lijun Ou
This patch adds a new device capability, IB_DEVICE_MEM_MGT_EXTENSIONS, to indicate device support for the following features: 1. Fast register memory region. 2. Send with remote invalidate by frmr. 3. Local invalidate memory region. It also adds the max depth of the frmr page list len. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Yixian Liu
Currently, all messages printed for aeq subtype events are wrong. Thus, delete them and print only the value of the subtype event. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Yixian Liu
The memory allocated for wrid should be initialized to zero. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Yixian Liu
The state of the mr after a reregister operation should be set to the valid state. Otherwise, it will stay the same as the state before reregistration. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by chenglang
This patch modifies the minimum CQ depth specification of hip08 to be consistent with the processing of hip06. Signed-off-by: chenglang <chenglang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 14 Feb, 2019 1 commit
-
-
Committed by Colin Ian King
The null check on an allocation failure on pd is currently checking if pd is non-null rather than null. Fix this by adding the missing ! operator. Fixes: 21a428a0 ("RDMA: Handle PD allocations by IB/core") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
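The class of bug fixed here can be shown with a small userspace sketch. It is not the driver's code: `demo_pd` and `demo_alloc_pd` are hypothetical names, and `calloc` stands in for the kernel allocator.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a protection domain object. */
struct demo_pd { int id; };

/* Returns 0 and stores the new object in *out, or -ENOMEM on failure. */
static int demo_alloc_pd(struct demo_pd **out)
{
    struct demo_pd *pd = calloc(1, sizeof(*pd));

    /* Correct form: bail out when the allocation FAILED.
     * Writing "if (pd)" here would return an error on every
     * successful allocation and leak the object. */
    if (!pd)
        return -ENOMEM;

    *out = pd;
    return 0;
}
```

With the inverted check, the success path becomes the error path, and a failed allocation falls through to code that dereferences a null pointer.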
-
- 12 Feb, 2019 1 commit
-
-
Committed by Shiraz, Saleem
Use the for_each_sg_dma_page iterator variant to walk the umem DMA-mapped SGL and get the page DMA address. This avoids the extra loop to iterate pages in the SGE when the for_each_sg iterator is used. Additionally, purge umem->page_shift usage in the driver, as it's only relevant for ODP MRs. Use the system page size and shift instead. Signed-off-by: Shiraz, Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 09 Feb, 2019 1 commit
-
-
Committed by Leon Romanovsky
The PD allocations in IB/core allow us to simplify drivers and their error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand in hand with the relevant update in .dealloc_pd(). We will use this opportunity to convert .dealloc_pd() so that it cannot fail, as was suggested a long time ago; failures are not happening, as we have never seen a WARN_ON print. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 05 Feb, 2019 4 commits
-
-
Committed by Wei Hu (Xavier)
On the hip08 chip, there is a possibility of the chip hanging when sending a doorbell during reset. We can fix it by prohibiting doorbells during reset. Fixes: 2d407888 ("RDMA/hns: Add support for processing send wr and receive wr") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Wei Hu (Xavier)
On the hip08 chip, there is a possibility of the chip hanging and of some errors when sending mailbox and doorbell during reset. We can fix it by prohibiting mailbox and doorbell during reset and once a reset has occurred, to ensure that the hardware can work normally. Fixes: a04ff739 ("RDMA/hns: Add command queue support for hip08 RoCE driver") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Wei Hu (Xavier)
In the reset process, the hns3 NIC driver notifies the RoCE driver to perform reset related processing by calling the .reset_notify() interface registered by the RoCE driver in the hip08 SoC. In the current version, if a reset occurs during the execution of rmmod or insmod of the ko, there may be an Oops error as below:
Internal error: Oops: 86000007 [#1] PREEMPT SMP
Modules linked in: hns_roce(O) hns3(O) hclge(O) hnae3(O) [last unloaded: hns_roce_hw_v2]
CPU: 0 PID: 14 Comm: kworker/0:1 Tainted: G O 4.19.0-ge00d540 #1
Hardware name: Huawei Technologies Co., Ltd.
Workqueue: events hclge_reset_service_task [hclge]
pstate: 60c00009 (nZCv daif +PAN +UAO)
pc : 0xffff00000100b0b8
lr : 0xffff00000100aea0
sp : ffff000009afbab0
x29: ffff000009afbab0 x28: 0000000000000800
x27: 0000000000007ff0 x26: ffff80002f90c004
x25: 00000000000007ff x24: ffff000008f97000
x23: ffff80003efee0a8 x22: 0000000000001000
x21: ffff80002f917ff0 x20: ffff8000286ea070
x19: 0000000000000800 x18: 0000000000000400
x17: 00000000c4d3225d x16: 00000000000021b8
x15: 0000000000000400 x14: 0000000000000400
x13: 0000000000000000 x12: ffff80003fac6e30
x11: 0000800036303000 x10: 0000000000000001
x9 : 0000000000000000 x8 : ffff80003016d000
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : 0000000000000000
x3 : 0000000000000004 x2 : 00000000000007ff
x1 : 0000000000000000 x0 : 0000000000000000
Process kworker/0:1 (pid: 14, stack limit = 0x00000000af8f0ad9)
Call trace:
0xffff00000100b0b8
0xffff00000100b3a0
hns_roce_init+0x624/0xc88 [hns_roce]
0xffff000001002df8
0xffff000001006960
hclge_notify_roce_client+0x74/0xe0 [hclge]
hclge_reset_service_task+0xa58/0xbc0 [hclge]
process_one_work+0x1e4/0x458
worker_thread+0x40/0x450
kthread+0x12c/0x130
ret_from_fork+0x10/0x18
Code: bad PC value
In the reset process, we first release the resources, and after the hardware reset is completed, we reapply the resources and reconfigure the hardware.
We can solve this problem by modifying both the NIC and the RoCE driver. We can modify the concurrent processing in the NIC driver to avoid calling the .reset_notify and .uninit_instance ops at the same time. We also need to modify the RoCE driver to record the reset stage and the driver's init/uninit state, and to check the state in the .reset_notify, .init_instance, and .uninit_instance functions to avoid NULL pointer operations. Fixes: cb7a94c9 ("RDMA/hns: Add reset process for RoCE in hip08") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by YueHaibing
Fixes the following sparse warnings: drivers/infiniband/hw/hns/hns_roce_hw_v2.c:5822:5: warning: symbol 'hns_roce_v2_query_srq' was not declared. Should it be static? drivers/infiniband/hw/hns/hns_roce_srq.c:158:6: warning: symbol 'hns_roce_srq_free' was not declared. Should it be static? drivers/infiniband/hw/hns/hns_roce_srq.c:81:5: warning: symbol 'hns_roce_srq_alloc' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 01 Feb, 2019 1 commit
-
-
Committed by YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: drivers/infiniband/hw/hns/hns_roce_hw_v2.c: In function 'hns_roce_v2_qp_flow_control_init': drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4384:33: warning: variable 'rst' set but not used [-Wunused-but-set-variable] It has never been used since its introduction. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 31 Jan, 2019 1 commit
-
-
Committed by Leon Romanovsky
All callers to ib_alloc_device() provide a larger size than struct ib_device and rely on the fact that struct ib_device is embedded in their driver specific structure as the first member. Provide a safer variant of ib_alloc_device() that checks and enforces this approach to make sure the drivers are using it right. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
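The "first member" requirement can be enforced at compile time. The following is a userspace sketch of the idea using C11 `static_assert` and `offsetof`; it is not the kernel's implementation, and every `demo_`/`DEMO_` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the core device structure. */
struct demo_ib_device { int index; };

/* Hypothetical driver structure embedding the core device first. */
struct demo_drv_dev {
    struct demo_ib_device ib_dev;  /* must stay the first member */
    int driver_private;
};

/* Fail the build if 'member' is not at offset 0 of 'type'. */
#define DEMO_CHECK_FIRST_MEMBER(type, member) \
    static_assert(offsetof(type, member) == 0, \
                  #member " must be the first member of " #type)

DEMO_CHECK_FIRST_MEMBER(struct demo_drv_dev, ib_dev);

/* Allocate the whole driver structure, zero-initialised. */
static struct demo_drv_dev *demo_alloc_drv_dev(void)
{
    return calloc(1, sizeof(struct demo_drv_dev));
}

/* Valid only because of the compile-time check above: a pointer to the
 * first member has the same address as its containing structure. */
static struct demo_drv_dev *demo_to_drv(struct demo_ib_device *ibdev)
{
    return (struct demo_drv_dev *)ibdev;
}
```

If someone later moves `ib_dev` away from the top of the structure, the build breaks at the check instead of the cast silently producing a wrong pointer at run time.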
-
- 26 Jan, 2019 2 commits
-
-
Committed by Masahiro Yamada
Currently, the Kbuild core manipulates header search paths in a crazy way [1]. To fix this mess, I want all Makefiles to add explicit $(srctree)/ to the search paths in the srctree. Some Makefiles are already written in that way, but not all. The goal of this work is to make the notation consistent, and finally get rid of the gross hacks. Having whitespaces after -I does not matter since commit 48f6e3cf ("kbuild: do not drop -I without parameter"). [1]: https://patchwork.kernel.org/patch/9632347/ Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Parvi Kaustubhi <pkaustub@cisco.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Lijun Ou
The hns_roce_ib_create_srq_resp is used to exchange data with the user. This was open coded to use a u32 directly; use a properly sized structure instead. Fixes: c7bcb134 ("RDMA/hns: Add SRQ support for hip08 kernel mode") Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 25 Jan, 2019 3 commits
-
-
Committed by Yangyang Li
This patch adds qpc timer and cqc timer allocation support for hardware timeout retransmission in the kernel space driver. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Yangyang Li
This patch adds SCC context clear support for DCQCN in the kernel space driver. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Yangyang Li
This patch adds SCC context allocation and initialization support for DCQCN in the kernel space driver. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 22 Jan, 2019 1 commit
-
-
Committed by Xiaofei Tan
AEQ overflow will be reported by hardware when too many asynchronous events occur without being handled in time. Normally, an AEQ overflow error rarely occurs. Once it happens, we have to do a physical function (PF) reset to recover. PF reset is implemented in two steps. First, set the reset level with ae_dev->ops->set_default_reset_request. Second, run the reset with ae_dev->ops->reset_event. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-