- 20 7月, 2017 29 次提交
-
-
由 Kaike Wan 提交于
Current computation of qp->timeout_jiffies in rvt_modify_qp() will cause overflow due to the fact that the input to the function usecs_to_jiffies is only 32-bit ( unsigned int). Overflow will occur when attr->timeout is equal to or greater than 30. The consequence is unnecessarily excessive retry and thus degradation of the system performance. This patch fixes the problem by limiting the input to 5-bit and calling usecs_to_jiffies() before multiplying the scaling factor. Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NKaike Wan <kaike.wan@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Matan Barak 提交于
Delete unused variables to prevent sparse warnings. Fixes: db1b5ddd ("IB/core: Rename uverbs event file structure") Fixes: fd3c7904 ("IB/core: Change idr objects to use the new schema") Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Selvin Xavier 提交于
Local ack delay exposed by the driver is 0 which means infinite QP timeout. Reporting the default value to 16 (approx 260ms) Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Selvin Xavier 提交于
While invoking the req_notify_cq hook, ULPs can request whether the CQs have any CQEs pending. If CQEs are pending, drivers can indicate it by returning 1 for req_notify_cq. The stack will poll CQ again till CQ is empty. This patch peeks the CQ for any valid entries and return accordingly. Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Devesh Sharma 提交于
Fix the incorrect reporting of number of polled entries by taking into account the max CQ depth in the driver. Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Devesh Sharma 提交于
Driver shall check if the host system bios has enabled Atomic operations capability in PCI Device Control 2 register of the pci-device. Expose the ATOMIC_HCA flag only if the Atomic operations capability is set. Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Somnath Kotur 提交于
Starting FW version 20.6.47, firmware is keeping separate statistics for L2 and RDMA. However, driver needs to specify RDMA or not when allocating stat_ctx. Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Eddie Wai 提交于
There's a couple of bugs in the support of max_rd_atomic and max_dest_rd_atomic. In the modify_qp, if the requested max_rd_atomic, which is the ORRQ size, is greater than what the chip can support, then we have to cap the request to chip max as we can't have the HW overflow the ORRQ. Capping the max_rd_atomic support internally is okay to do as the remaining read/atomic WRs will still be sitting in the SQ. However, for the max_dest_rd_atomic, the driver has to error out as this dictates the IRRQ size and we can't control what the remote side sends. Signed-off-by: NEddie Wai <eddie.wai@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Selvin Xavier 提交于
- Report supported value for max_mr_size to IB stack in query_device. Also, check and log if MR size requested by application in reg_user_mr() is greater than value currently supported by driver. - Report only 4K page size support for now - Fix Max_QP value returned by ibv_devinfo -vv. In case of PF, FW reserves 129 QPs for creating QP1s of VFs and PF. So the max_qp value reported by FW for PF doesn'tt include the QP1. Fixing this issue by adding 1 with the value reported by FW. Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Selvin Xavier 提交于
This fix is added only to avoid system crash in some a specific scenario. When bnxt_re driver is loaded and if user tries to change interface mac address, delete GID fails because QP1 is still associated with existing MAC (default GID). If the above command fails GID tables are not modified in the h/w or driver, but the GID context memory is freed. Now, if the user changes the mac back to the original value, another add_gid comes to the driver where the driver reports that the GID is already present in its table and tries to access the context which was already freed. So, in this case, in order to avoid NULL pointer de-reference, this patch removes the context memory free if delete_gid fails and the same context memory is re-used in new add_gid. Memory cleanup will be taken care during driver unload, while deleting the GID table. Signed-off-by: NKalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Somnath Kotur 提交于
Posting WQE size of 2 results in a WQE_FORMAT_ERROR thrown by the HW as it requires host to supply WQE Size with room for atleast one SGE so that the resulting WQE size be atleast 3. Signed-off-by: NSomnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Devesh Sharma 提交于
The driver must free the DPI during the dealloc_ucontext instead of freeing it during dealloc_pd. However, the DPI allocation scheme remains unchanged. Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Dan Carpenter 提交于
"umem" is a valid pointer. We intended to print "*umem" or even just "err" instead. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Dan Carpenter 提交于
If either of these allocations fail then we return ERR_PTR(0). That's equivalent to NULL and results in a NULL pointer dereference in the caller. Fixes: fe2caefc ("RDMA/ocrdma: Add driver for Emulex OneConnect IBoE RDMA adapter") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Dan Carpenter 提交于
We should preserve the original "status" error code instead of resetting it to zero. Returning ERR_PTR(0) is the same as NULL and results in a NULL dereference in the callers. I added a printk() on error instead. Fixes: 45e86b33 ("RDMA/ocrdma: Cache recv DB until QP moved to RTR") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Dan Carpenter 提交于
We accidentally don't set the error code on some error paths. It means return ERR_PTR(0) which is NULL and results in a NULL dereference in the caller. Fixes: 13a23933 ("RDMA/cxgb3: Don't ignore insert_handle() failures") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Dan Carpenter 提交于
If one of these kmalloc() calls fails then we return ERR_PTR(0) which is NULL. It results in a NULL dereference in the callers. Fixes: cfdda9d7 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Dan Carpenter 提交于
We accidentally forgot to set the error code if ib_copy_from_udata() fails. It means we return ERR_PTR(0) which is NULL and results in a NULL dereference in the callers. Fixes: d3749841 ("i40iw: add files for iwarp interface") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NShiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Dan Carpenter 提交于
We accidentally don't see the error code on some of these error paths. It means we return ERR_PTR(0) which is NULL and it results in a NULL dereference in the caller. This bug dates to pre-git days. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Reviewed-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Dan Carpenter 提交于
bnxt_re_alloc_mw() doesn't return NULL, it returns error pointers. Fixes: 9152e0b7 ("RDMA/bnxt_re: HW workarounds for handling specific conditions") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Tatyana Nikolova 提交于
If the physical buffer list entries (PBLEs) of a QP are freed up at i40iw_dereg_mr, they can be assigned to a newly created QP before the previous QP is destroyed. Fix this by freeing PBLEs only when the QP is destroyed. Signed-off-by: NTatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: NFaisal Latif <faisal.latif@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Shiraz Saleem 提交于
Control Queue Pair (CQP) request objects, which have not received a completion upon interface close, remain in memory. To fix this, identify and free all pending CQP request objects during destroy CQP OP. Signed-off-by: NShiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: NHenry Orosco <henry.orosco@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Henry Orosco 提交于
To avoid infinite loop, in i40iw_ieq_handle_exception, update plist inside while loop. Signed-off-by: NHenry Orosco <henry.orosco@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Henry Orosco 提交于
Add missing write memory barrier before writing the header containing valid bit to the WQE in i40iw_puda_send. Signed-off-by: NHenry Orosco <henry.orosco@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Shiraz Saleem 提交于
Current flow leaves software QP structures in memory if Control Queue Pair (CQP) destroy QP OP fails. To fix this, free QP resources on fail of CQP destroy QP OP. Signed-off-by: NShiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: NHenry Orosco <henry.orosco@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Shiraz Saleem 提交于
On PCI function reset, cm_id reference is not released which causes an application hang, as it waits on the cm_id to be released on rdma_destroy. To fix this, call i40iw_cm_disconn during a PCI function reset to clean-up resources and release cm_id reference. Signed-off-by: NShiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: NHenry Orosco <henry.orosco@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Shiraz Saleem 提交于
Utilize iwdev->reset on a PCI function reset notification instead of passing in reset flag for resource clean-up. Signed-off-by: NShiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: NHenry Orosco <henry.orosco@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Mustafa Ismail 提交于
Control Queue Pair (CQP) OPs, in this case - Update SDs, cannot poll the Control Completion Queue (CCQ) after CCQ is destroyed. Instead, poll via registers. Signed-off-by: NMustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: NHenry Orosco <henry.orosco@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Mustafa Ismail 提交于
The order for calling i40iw_destroy_pble_pool is incorrect. Also, add PBLE_CHUNK_MEM init state to track pble pool creation and destruction. Signed-off-by: NMustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: NHenry Orosco <henry.orosco@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 18 7月, 2017 11 次提交
-
-
由 Tadeusz Struk 提交于
Playing with IP-O-IB interface can trigger a warning message: "ib0: Failed to modify QP to ERROR state" to be logged. This happens when the QP is in IB_QPS_RESET state and the stack is trying to transition it to IB_QPS_ERR state in ipoib_ib_dev_stop(). According to the IB spec, Table 91 - "QP State Transition Properties" it looks like the transition from reset to error is valid: Transition: Any State to Error Required Attributes: None Optional Attributes: None allowed Actions: Queue processing is stopped. Work Requests pending or in process are completed in error, when possible. This patch allows the transition and quiets the message. Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NTadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 oulijun 提交于
This patch correct the comment style warnings caught by checkpatch.pl script. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 oulijun 提交于
When modified the MAC address used hns_roce_mac function, we release and create reserved qp again, It is not necessary to use spin_lock_bh and spin_unlock_bh in handle_en_event, Otherwise, it will occur a error. This patch mainly fixes it. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 oulijun 提交于
When opcode of work request is RDMA read and write, it should use rdma_wr to get remote_addr and rkey. This patch fixes it. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 oulijun 提交于
When destroyed rc qp, the hr_qp will be used after freed. This patch will fix it. Signed-off-by: NLijun Ou <oulijun@huawei.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 oulijun 提交于
In hip06 SoC, RoCE driver creates 8 reserved loopback QPs to ensure zero wqe when free mr. However, if the enabled phy port number is less than 6, it will fail in polling cqe with 8 reserved loopback QPs. In order to solve this problem, the number of loopback Qps will be adjusted based on the number of enabled phy port. Signed-off-by: NShaobo Xu <xushaobo2@huawei.com> Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 yonatanc 提交于
The RXE coupled with dummy device causes to the kernel panic attached below. The panic happens when ib_register_device tries to set dma_mask by accessing a NULLed parent device. The RXE does not actually use DMA, so we can set the dma_mask to architecture value. [16240.199689] RIP: 0010:ib_register_device+0x468/0x5a0 [ib_core] [16240.205289] RSP: 0018:ffffc9000220fc10 EFLAGS: 00010246 [16240.209909] RAX: 0000000000000024 RBX: ffff880220d1a2a8 RCX: 0000000000000000 [16240.212244] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 [16240.214385] RBP: ffffc9000220fcb0 R08: 0000000000000000 R09: 000000000000023f [16240.254465] R10: 0000000000000007 R11: 0000000000000000 R12: 0000000000000000 [16240.259467] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880220d1a2a8 [16240.263314] FS: 00007fd8ecca0740(0000) GS:ffff8802364c0000(0000) knlGS:0000000000000000 [16240.267292] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [16240.273503] CR2: 0000000000000218 CR3: 00000002253ba000 CR4: 00000000000006e0 [16240.277066] Call Trace: [16240.281836] ? __kmalloc+0x26f/0x280 [16240.286596] rxe_register_device+0x297/0x300 [rdma_rxe] [16240.291377] rxe_add+0x535/0x5b0 [rdma_rxe] [16240.297586] rxe_net_add+0x3e/0xc0 [rdma_rxe] [16240.302375] rxe_param_set_add+0x65/0x144 [rdma_rxe] [16240.307769] param_attr_store+0x68/0xd0 [16240.311640] module_attr_store+0x1d/0x30 [16240.316421] sysfs_kf_write+0x3a/0x50 [16240.317802] kernfs_fop_write+0xff/0x180 [16240.322989] __vfs_write+0x37/0x140 [16240.328164] ? handle_mm_fault+0xce/0x240 [16240.333340] vfs_write+0xb2/0x1b0 [16240.335013] SyS_write+0x55/0xc0 [16240.340632] entry_SYSCALL_64_fastpath+0x1a/0xa9 Fixes: 8700e3e7 ("Soft RoCE driver") Signed-off-by: NYonatan Cohen <yonatanc@mellanox.com> Reviewed-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Yonatan Cohen 提交于
In the time between rxe_send has finished and skb destructor called, the QP's ref count might be 0, leading to a possible QP destruction. This will lead to a kernel panic when the destructor dereferences the QP. The operation of incrementing QP ref count at rxe_send and decrementing from skb destructor will prevent this crash. BUG: unable to handle kernel NULL pointer dereference at 000000000000072c IP: [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] PGD 0 [16240.211178] Oops: 0002 [#1] SMP CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 4.9.0-mlnx #1 Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011 task: ffff88042d6b1480 task.stack: ffffc90001904000 RIP: 0010:[<ffffffffa05df765>] [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] RSP: 0018:ffff88043fcc3df0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880429684700 RCX: ffff88042d248200 RDX: 00000000ffffffff RSI: 00000000fffffe01 RDI: ffff880429684700 RBP: ffff88043fcc3e00 R08: ffff88043fcda240 R09: 00000000ff2d1de6 R10: 0000000000000000 R11: 00000000f49cf6fe R12: ffff880429684700 R13: ffffffff81893f96 R14: ffffffff817d66f0 R15: ffff880427f74200 FS: 0000000000000000(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000072c CR3: 000000041d3df000 CR4: 00000000000006e0 Stack: ffffffff817b29cf ffff880429684700 ffff88043fcc3e18 ffffffff817b42c2 ffff880429684700 ffff88043fcc3e40 ffffffff817b4332 ffff880429684700 ffff880427f74238 ffff880427f74228 ffff88043fcc3e58 ffffffff81893f96 Call Trace: <IRQ> [16240.336345] [<ffffffff817b29cf>] ? skb_release_head_state+0x4f/0xb0 [<ffffffff817b42c2>] skb_release_all+0x12/0x30 [<ffffffff817b4332>] kfree_skb+0x32/0x90 [<ffffffff81893f96>] ndisc_error_report+0x36/0x40 [<ffffffff817d4de1>] neigh_invalidate+0x81/0xf0 [<ffffffff817d68f7>] neigh_timer_handler+0x207/0x2b0 [<ffffffff81109295>] call_timer_fn+0x35/0x120 [<ffffffff81109db7>] run_timer_softirq+0x1d7/0x460 [<ffffffff8106155e>] ? kvm_sched_clock_read+0x1e/0x30 [<ffffffff810366b9>] ? sched_clock+0x9/0x10 [<ffffffff810cfed2>] ? sched_clock_cpu+0x72/0xa0 [<ffffffff818dd537>] __do_softirq+0xd7/0x289 [<ffffffff810a6c95>] irq_exit+0xb5/0xc0 [<ffffffff818dd372>] smp_apic_timer_interrupt+0x42/0x50 [<ffffffff818dc682>] apic_timer_interrupt+0x82/0x90 <EOI> [16240.395776] [<ffffffff818da156>] ? native_safe_halt+0x6/0x10 [<ffffffff818d9e6e>] default_idle+0x1e/0xd0 [<ffffffff8103797f>] arch_cpu_idle+0xf/0x20 [<ffffffff818da2c5>] default_idle_call+0x35/0x40 [<ffffffff810e3eb5>] cpu_startup_entry+0x185/0x210 [<ffffffff81050433>] start_secondary+0x103/0x130 RIP [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] Fixes: 8700e3e7 ("Soft RoCE driver") Signed-off-by: NYonatan Cohen <yonatanc@mellanox.com> Reviewed-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Erez Shitrit 提交于
The driver checks if the lower level driver supports get_stats, and if so calls it to get the updated statistics, otherwise takes from the current netdevice stats object. Signed-off-by: NErez Shitrit <erezsh@mellanox.com> Reviewed-by: NAlex Vesker <valex@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Reviewed-by: NYuval Shaia <yuval.shaia@oracle.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Majd Dibbiny 提交于
Currently the RoCE GID management uses the ib_wq to do add and delete new GIDs according to the netdev events. The ib_wq isn't an ordered workqueue and thus two work elements can be executed concurrently which will result in unexpected behavior and inconsistency of the GIDs cache content. Example: ifconfig eth1 11.11.11.11/16 up This command will invoke the following netdev events in the following order: 1. NETDEV_UP 2. NETDEV_DOWN 3. NETDEV_UP If (2) and (3) will be executed concurrently or in reverse order, instead of having a new GID with 11.11.11.11 IP, we will end up without any new GIDs. Signed-off-by: NMajd Dibbiny <majd@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Reviewed-by: NYuval Shaia <yuval.shaia@oracle.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Leon Romanovsky 提交于
The failure in creation of debugfs entries for mr_cache left entries, which were already created. It caused to mismatch and misguiding for the end users. The solution is to clean mr_cache debugfs root, so no leftovers will be in the system. In addition, let's document why the error is not needed to be forwarded to user in case of failure. Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-