1. 20 Jul, 2017 29 commits
  2. 18 Jul, 2017 11 commits
    • IB/core: Allow QP state transition from reset to error · ebc9ca43
      Tadeusz Struk committed
      Playing with an IPoIB interface can cause the warning message
      "ib0: Failed to modify QP to ERROR state" to be logged.
      This happens when the QP is in the IB_QPS_RESET state and the stack
      tries to transition it to the IB_QPS_ERR state in ipoib_ib_dev_stop().
      
      According to the IB spec, Table 91 - "QP State Transition Properties"
      it looks like the transition from reset to error is valid:
      
      Transition: Any State to Error
      Required Attributes: None
      Optional Attributes: None allowed
      Actions: Queue processing is stopped. Work Requests pending or in
      process are completed in error, when possible.
      
      This patch allows the transition and quiets the message.
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
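The rule quoted from the spec ("Any State to Error") can be sketched as a small userspace check. This is only an illustrative model: the enum, the table and `qp_transition_allowed()` are hypothetical names, the table lists just the normal bring-up path, and none of it mirrors the kernel's actual state table or its attribute masks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of QP states; the names are illustrative
 * and do not match the kernel's enum ib_qp_state. */
enum qp_state { QPS_RESET, QPS_INIT, QPS_RTR, QPS_RTS, QPS_ERR, QPS_COUNT };

/* Deliberately incomplete: only the normal bring-up path is listed. */
static const bool bringup_path[QPS_COUNT][QPS_COUNT] = {
    [QPS_RESET][QPS_INIT] = true,
    [QPS_INIT][QPS_RTR]   = true,
    [QPS_RTR][QPS_RTS]    = true,
};

static bool qp_transition_allowed(enum qp_state cur, enum qp_state next)
{
    assert(cur < QPS_COUNT && next < QPS_COUNT);

    /* "Any State to Error": allowed from every state, RESET included. */
    if (next == QPS_ERR)
        return true;

    return bringup_path[cur][next];
}
```

In this model, `qp_transition_allowed(QPS_RESET, QPS_ERR)` is accepted, which is the case the patch enables, while a skip such as RESET to RTS is still rejected.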
    • IB/hns: Fix for checkpatch.pl comment style warnings · 5f110ac4
      oulijun committed
      This patch corrects the comment style warnings caught by
      the checkpatch.pl script.
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/hns: Fix the bug with modifying the MAC address without removing the driver · d322f004
      oulijun committed
      When the MAC address is modified using the hns_roce_mac function, the
      reserved QP is released and created again. It is not necessary to use
      spin_lock_bh and spin_unlock_bh in handle_en_event; doing so causes an
      error. This patch fixes it.
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/hns: Fix the bug with rdma operation · 9de61d3f
      oulijun committed
      When the opcode of a work request is RDMA read or write, the driver
      should use rdma_wr to get remote_addr and rkey. This
      patch fixes it.
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
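The idea of recovering the RDMA-specific fields from a generic work request can be sketched in userspace as follows. This is a minimal illustration, assuming an RDMA request that embeds the generic request as a member. `struct send_wr`, `struct rdma_send_wr` and `to_rdma_wr()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic work request, as seen by the posting code. */
struct send_wr {
    int opcode;
};

/* Hypothetical RDMA work request that embeds the generic one. */
struct rdma_send_wr {
    struct send_wr wr;
    uint64_t remote_addr;
    uint32_t rkey;
};

/* Step back from the embedded member to the enclosing RDMA request. */
static const struct rdma_send_wr *to_rdma_wr(const struct send_wr *wr)
{
    assert(wr != NULL);
    return (const struct rdma_send_wr *)
        ((const char *)wr - offsetof(struct rdma_send_wr, wr));
}
```

The conversion is only valid when the generic request really is embedded in an RDMA request, which is why it should be applied only for RDMA read and write opcodes.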
    • IB/hns: Fix the bug with wild pointer when destroy rc qp · 58c4f0d8
      oulijun committed
      When an RC QP is destroyed, hr_qp is used after it has been freed.
      This patch fixes it.
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/hns: Fix the bug of polling cq failed for loopback Qps · 5802883d
      oulijun committed
      In the hip06 SoC, the RoCE driver creates 8 reserved loopback QPs to
      ensure zero WQEs when freeing an MR. However, if the number of enabled
      phy ports is less than 6, polling the CQE fails with
      8 reserved loopback QPs.
      
      To solve this problem, the number of loopback QPs
      is adjusted based on the number of enabled phy ports.
      Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/rxe: Set dma_mask and coherent_dma_mask · 56012e1c
      yonatanc committed
      RXE coupled with a dummy device causes the kernel panic attached
      below. The panic happens when ib_register_device tries to set dma_mask
      by accessing a NULL parent device.
      
      RXE does not actually use DMA, so we can set the dma_mask
      to the architecture value.
      
      [16240.199689] RIP: 0010:ib_register_device+0x468/0x5a0 [ib_core]
      [16240.205289] RSP: 0018:ffffc9000220fc10 EFLAGS: 00010246
      [16240.209909] RAX: 0000000000000024 RBX: ffff880220d1a2a8 RCX: 0000000000000000
      [16240.212244] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
      [16240.214385] RBP: ffffc9000220fcb0 R08: 0000000000000000 R09: 000000000000023f
      [16240.254465] R10: 0000000000000007 R11: 0000000000000000 R12: 0000000000000000
      [16240.259467] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880220d1a2a8
      [16240.263314] FS:  00007fd8ecca0740(0000) GS:ffff8802364c0000(0000) knlGS:0000000000000000
      [16240.267292] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [16240.273503] CR2: 0000000000000218 CR3: 00000002253ba000 CR4: 00000000000006e0
      [16240.277066] Call Trace:
      [16240.281836]  ? __kmalloc+0x26f/0x280
      [16240.286596]  rxe_register_device+0x297/0x300 [rdma_rxe]
      [16240.291377]  rxe_add+0x535/0x5b0 [rdma_rxe]
      [16240.297586]  rxe_net_add+0x3e/0xc0 [rdma_rxe]
      [16240.302375]  rxe_param_set_add+0x65/0x144 [rdma_rxe]
      [16240.307769]  param_attr_store+0x68/0xd0
      [16240.311640]  module_attr_store+0x1d/0x30
      [16240.316421]  sysfs_kf_write+0x3a/0x50
      [16240.317802]  kernfs_fop_write+0xff/0x180
      [16240.322989]  __vfs_write+0x37/0x140
      [16240.328164]  ? handle_mm_fault+0xce/0x240
      [16240.333340]  vfs_write+0xb2/0x1b0
      [16240.335013]  SyS_write+0x55/0xc0
      [16240.340632]  entry_SYSCALL_64_fastpath+0x1a/0xa9
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/rxe: Fix kernel panic from skb destructor · fda85ce9
      Yonatan Cohen committed
      In the time between rxe_send finishing and the skb destructor being
      called, the QP's ref count might drop to 0, leading to a possible
      QP destruction. This leads to a kernel panic when the destructor
      dereferences the QP.
      
      Incrementing the QP ref count in rxe_send and decrementing it
      in the skb destructor prevents this crash.
      
      BUG: unable to handle kernel NULL pointer dereference at 000000000000072c
      IP: [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
      PGD 0 [16240.211178]
      Oops: 0002 [#1] SMP
      CPU: 3 PID: 0 Comm: swapper/3 Tainted: G           OE   4.9.0-mlnx #1
      Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
      task: ffff88042d6b1480 task.stack: ffffc90001904000
      RIP: 0010:[<ffffffffa05df765>]  [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
      RSP: 0018:ffff88043fcc3df0  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff880429684700 RCX: ffff88042d248200
      RDX: 00000000ffffffff RSI: 00000000fffffe01 RDI: ffff880429684700
      RBP: ffff88043fcc3e00 R08: ffff88043fcda240 R09: 00000000ff2d1de6
      R10: 0000000000000000 R11: 00000000f49cf6fe R12: ffff880429684700
      R13: ffffffff81893f96 R14: ffffffff817d66f0 R15: ffff880427f74200
      FS:  0000000000000000(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000000072c CR3: 000000041d3df000 CR4: 00000000000006e0
      Stack:
       ffffffff817b29cf ffff880429684700 ffff88043fcc3e18 ffffffff817b42c2
       ffff880429684700 ffff88043fcc3e40 ffffffff817b4332 ffff880429684700
       ffff880427f74238 ffff880427f74228 ffff88043fcc3e58 ffffffff81893f96
      Call Trace:
       <IRQ> [16240.336345]  [<ffffffff817b29cf>] ? skb_release_head_state+0x4f/0xb0
       [<ffffffff817b42c2>] skb_release_all+0x12/0x30
       [<ffffffff817b4332>] kfree_skb+0x32/0x90
       [<ffffffff81893f96>] ndisc_error_report+0x36/0x40
       [<ffffffff817d4de1>] neigh_invalidate+0x81/0xf0
       [<ffffffff817d68f7>] neigh_timer_handler+0x207/0x2b0
       [<ffffffff81109295>] call_timer_fn+0x35/0x120
       [<ffffffff81109db7>] run_timer_softirq+0x1d7/0x460
       [<ffffffff8106155e>] ? kvm_sched_clock_read+0x1e/0x30
       [<ffffffff810366b9>] ? sched_clock+0x9/0x10
       [<ffffffff810cfed2>] ? sched_clock_cpu+0x72/0xa0
       [<ffffffff818dd537>] __do_softirq+0xd7/0x289
       [<ffffffff810a6c95>] irq_exit+0xb5/0xc0
       [<ffffffff818dd372>] smp_apic_timer_interrupt+0x42/0x50
       [<ffffffff818dc682>] apic_timer_interrupt+0x82/0x90
       <EOI> [16240.395776]  [<ffffffff818da156>] ? native_safe_halt+0x6/0x10
       [<ffffffff818d9e6e>] default_idle+0x1e/0xd0
       [<ffffffff8103797f>] arch_cpu_idle+0xf/0x20
       [<ffffffff818da2c5>] default_idle_call+0x35/0x40
       [<ffffffff810e3eb5>] cpu_startup_entry+0x185/0x210
       [<ffffffff81050433>] start_secondary+0x103/0x130
      RIP  [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
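The reference-counting pattern described in this commit can be sketched in userspace as follows. This is a single-threaded illustration under stated assumptions: `struct qp`, `struct pkt`, `send_pkt()` and `pkt_destructor()` are hypothetical stand-ins for the QP, the skb, rxe_send and the skb destructor, and a plain `int` replaces the kernel's reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a QP with a reference count. */
struct qp {
    int refcnt;
    bool destroyed;
};

/* Hypothetical stand-in for an skb that points back at its QP. */
struct pkt {
    struct qp *qp;
};

static void qp_get(struct qp *qp)
{
    qp->refcnt++;
}

static void qp_put(struct qp *qp)
{
    assert(qp->refcnt > 0);
    if (--qp->refcnt == 0)
        qp->destroyed = true;   /* last reference gone: tear the QP down */
}

/* Send path: take a reference that lives as long as the packet does. */
static void send_pkt(struct qp *qp, struct pkt *p)
{
    qp_get(qp);
    p->qp = qp;
}

/* Destructor: the QP must still be alive here, then drop the reference. */
static void pkt_destructor(struct pkt *p)
{
    assert(!p->qp->destroyed);
    qp_put(p->qp);
    p->qp = NULL;
}
```

With this pairing, the owner can drop its own reference while a packet is still in flight. The QP is only torn down once the destructor has run.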
    • IB/ipoib: Let lower driver handle get_stats64 call · b6c871e5
      Erez Shitrit committed
      The driver checks whether the lower-level driver supports get_stats and,
      if so, calls it to get the updated statistics; otherwise it takes them
      from the current netdevice stats object.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Alex Vesker <valex@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add ordered workqueue for RoCE GID management · 8fe8bacb
      Majd Dibbiny committed
      Currently, RoCE GID management uses ib_wq to add and delete GIDs
      according to netdev events.
      
      ib_wq isn't an ordered workqueue, so two work elements can be executed
      concurrently, which results in unexpected behavior and inconsistent
      GID cache content.
      
      Example:
      ifconfig eth1 11.11.11.11/16 up
      
      This command will invoke the following netdev events in the following order:
      1. NETDEV_UP
      2. NETDEV_DOWN
      3. NETDEV_UP
      
      If (2) and (3) are executed concurrently or in reverse order, instead of
      having a new GID with the 11.11.11.11 IP, we end up without any new GIDs.
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
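Why ordering matters for the UP, DOWN, UP sequence above can be sketched with a tiny userspace model. It assumes a FIFO that is drained strictly one item at a time and a GID cache reduced to a single flag. `struct ordered_queue`, `queue_event()` and `drain_in_order()` are hypothetical names, not kernel APIs; in the kernel, `alloc_ordered_workqueue()` is the primitive that gives this kind of one-at-a-time, in-order execution.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the NETDEV_UP / NETDEV_DOWN events. */
enum netdev_event { EV_UP, EV_DOWN };

#define QUEUE_CAP 8

/* Minimal FIFO modelling an ordered queue of pending GID updates. */
struct ordered_queue {
    enum netdev_event items[QUEUE_CAP];
    size_t head;
    size_t tail;
};

static void queue_event(struct ordered_queue *q, enum netdev_event ev)
{
    assert(q->tail < QUEUE_CAP);
    q->items[q->tail++] = ev;
}

/* Apply events strictly in submission order, one at a time, and report
 * whether the GID is present in the (single-entry) cache at the end. */
static bool drain_in_order(struct ordered_queue *q)
{
    bool gid_present = false;

    while (q->head < q->tail)
        gid_present = (q->items[q->head++] == EV_UP);

    return gid_present;
}
```

Draining UP, DOWN, UP in order leaves the GID present. If the last two events are applied as UP then DOWN, the GID is missing, which is the inconsistency the commit describes.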
    • IB/mlx5: Clean mr_cache debugfs in case of failure · 12cc1a02
      Leon Romanovsky committed
      A failure while creating the debugfs entries for mr_cache left behind
      the entries that had already been created.
      
      This caused a mismatch and was misleading for end users. The solution
      is to clean the mr_cache debugfs root, so no leftovers remain in the
      system. In addition, document why the error does not need to be
      forwarded to the user in case of failure.
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>