1. 05 Apr, 2018 1 commit
    • M
      IB/uverbs: Add enum attribute type to ioctl() interface · 494c5580
      Matan Barak committed
      Methods sometimes need to get one attribute out of a group of
      pre-defined attributes. This is an enum-like behavior. Since
      this is a common requirement, we add a new ENUM attribute to the
      generic uverbs ioctl() layer. This attribute is embedded in methods,
      like any other attributes we currently have. ENUM attributes point to
      an array of standard UVERBS_ATTR_PTR_IN. The user-space encodes the
      enum's attribute id in the id field and the internal PTR_IN attr id in
      the enum_data.elem_id field. This ENUM attribute could be shared by
      several attributes and it can get UVERBS_ATTR_SPEC_F_MANDATORY flag,
      stating this attribute must be supported by the kernel, like any other
      attribute.
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
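      The encoding described in this commit message (the enum's attribute id in
      the id field, the chosen inner PTR_IN attribute in an elem_id field) can be
      sketched roughly as follows. This is a simplified, hypothetical model: the
      struct names, fields and the enum_attr_validate() helper are assumptions
      made for illustration and are not the kernel's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; NOT the kernel's real structures. */

/* One selectable element of the enum, modelled as a PTR_IN-like input
 * that only carries a minimum payload length. */
struct elem_spec {
    uint16_t min_len;
};

/* Specification of an ENUM attribute: an id plus an array of elements. */
struct enum_attr_spec {
    uint16_t id;
    const struct elem_spec *elems;
    size_t num_elems;
    int mandatory; /* stands in for a "must be supported" flag */
};

/* What user space sends: the enum's attribute id, the index of the
 * element it picked, and the payload length. */
struct user_attr {
    uint16_t id;
    uint8_t elem_id;
    uint16_t len;
};

/* Return 0 if the user attribute selects a valid element of the enum
 * with a large enough payload, -1 otherwise. */
int enum_attr_validate(const struct enum_attr_spec *spec,
                       const struct user_attr *attr)
{
    if (attr->id != spec->id)
        return -1;
    if (attr->elem_id >= spec->num_elems)
        return -1;
    if (attr->len < spec->elems[attr->elem_id].min_len)
        return -1;
    return 0;
}
```

      Because the specification only points at the element array, the same array
      could be shared by several enum attribute specifications in this model.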
  2. 04 Apr, 2018 5 commits
    • P
      RDMA: Use ib_gid_attr during GID modification · 414448d2
      Parav Pandit committed
      Now that ib_gid_attr contains device, port and index, simplify the
      provider APIs add_gid() and del_gid() to use device, port and index
      fields from the ib_gid_attr attributes structure.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
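      A minimal sketch of the idea in this commit: once the attribute structure
      carries device, port and index, provider callbacks can take the attribute
      alone instead of separate arguments. The types and the provider_add_gid()
      and provider_del_gid() names below are hypothetical stand-ins, not the
      kernel's actual provider API.

```c
#include <stdint.h>
#include <string.h>

#define GID_LEN 16
#define PORTS 2
#define GIDS_PER_PORT 4

/* Hypothetical device with a per-port hardware GID table. */
struct hw_device {
    uint8_t table[PORTS][GIDS_PER_PORT][GID_LEN];
};

/* Simplified stand-in for a GID attribute that carries its location. */
struct gid_attr {
    struct hw_device *device;
    uint8_t port_num;  /* 1-based port number */
    uint16_t index;    /* index into that port's GID table */
};

/* Check that the attribute points at a real table slot. */
static int gid_attr_ok(const struct gid_attr *attr)
{
    return attr->device && attr->port_num >= 1 &&
           attr->port_num <= PORTS && attr->index < GIDS_PER_PORT;
}

/* Device, port and index all come from the attribute. */
int provider_add_gid(const uint8_t gid[GID_LEN], const struct gid_attr *attr)
{
    if (!gid_attr_ok(attr))
        return -1;
    memcpy(attr->device->table[attr->port_num - 1][attr->index], gid, GID_LEN);
    return 0;
}

int provider_del_gid(const struct gid_attr *attr)
{
    if (!gid_attr_ok(attr))
        return -1;
    memset(attr->device->table[attr->port_num - 1][attr->index], 0, GID_LEN);
    return 0;
}
```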
    • P
      IB/core: Refactor GID modify code for RoCE · 598ff6ba
      Parav Pandit committed
      The code is refactored into separate functions for RoCE that can do more
      complex operations related to reference counting, while still
      maintaining code readability. This includes:
      (a) Simplification to not perform netdevice checks and modifications
      for IB link layer.
      (b) Do not add RoCE GID entry which has NULL netdevice; instead return
      an error.
      (c) If GID addition fails at provider level add_gid(), do not add the
      entry in the cache and keep the entry marked as INVALID.
      (d) Simplify and reuse the ib_cache_gid_add()/del() routines so that they
      can be used even for modifying default GIDs. This avoids some code
      duplication in modifying default GIDs.
      (e) find_gid() routine refers to the data entry flags to qualify a GID
      as valid or invalid GID rather than depending on attributes and zeroness
      of the GID content.
      (f) gid_table_reserve_default() sets the GID default attribute at the
      beginning, while setting up the GID table. There is no need to use the
      default_gid flag in low-level functions such as write_gid(), add_gid(),
      del_gid(), as they never need to update the DEFAULT property of the GID
      entry during a GID table update.
      
      As a result of this refactor, the reserved GID 0:0:0:0:0:0:0:0 is no longer
      searchable, as described below.
      
      A unicast GID entry of 0:0:0:0:0:0:0:0 is Reserved GID as per the IB
      spec version 1.3 section 4.1.1, point (6) whose snippet is below.
      
      "The unicast GID address 0:0:0:0:0:0:0:0 is reserved - referred to as
      the Reserved GID. It shall never be assigned to any endport. It shall
      not be used as a destination address or in a global routing header
      (GRH)."
      
      GID table cache now only stores valid GID entries. Before this patch,
      Reserved GID 0:0:0:0:0:0:0:0 was searchable in the GID table using
      ib_find_cached_gid_by_port() and other similar find routines.
      
      The zero GID is no longer searchable as it shall not be present in a GRH or
      path record entry, as described in IB spec version 1.3 section 4.1.1,
      point (6), section 12.7.10 and section 12.7.20.
      
      ib_cache_update() is simplified to check the link layer once, use a unified
      locking scheme for all link layers, and drop the temporary GID table
      allocation/free logic.
      
      Additionally,
      (a) Expand ib_gid_attr to store port and index so that GID query
      routines can get port and index information from the attribute structure.
      (b) Expand ib_gid_attr to store device as well so that in future code when
      GID reference counting is done, device is used to reach back to the GID
      table entry.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
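      Point (e) and the Reserved GID behaviour can be illustrated with a small
      model: when lookup decides validity from per-entry flags rather than from
      the GID bytes, unused slots (whose bytes happen to be zero) never match a
      search for the zero GID. The flag name, table layout and function names
      below are assumptions for illustration and do not mirror the kernel's
      cache code.

```c
#include <stdint.h>
#include <string.h>

#define GID_LEN 16
#define GID_TABLE_SIZE 4

/* Hypothetical flag; the real cache uses its own flag definitions. */
enum { GID_ENTRY_INVALID = 1u << 0 };

struct gid_entry {
    uint8_t gid[GID_LEN];
    unsigned int flags;
};

struct gid_table {
    struct gid_entry entries[GID_TABLE_SIZE];
};

/* Every slot starts out zeroed and explicitly marked INVALID. */
void gid_table_init(struct gid_table *t)
{
    memset(t, 0, sizeof(*t));
    for (int i = 0; i < GID_TABLE_SIZE; i++)
        t->entries[i].flags = GID_ENTRY_INVALID;
}

int gid_table_add(struct gid_table *t, int index, const uint8_t gid[GID_LEN])
{
    if (index < 0 || index >= GID_TABLE_SIZE)
        return -1;
    memcpy(t->entries[index].gid, gid, GID_LEN);
    t->entries[index].flags &= ~GID_ENTRY_INVALID;
    return 0;
}

void gid_table_del(struct gid_table *t, int index)
{
    memset(t->entries[index].gid, 0, GID_LEN);
    t->entries[index].flags |= GID_ENTRY_INVALID;
}

/* Validity comes from the flags, not from the GID content, so an
 * all-zero GID in an unused slot is never returned. */
int gid_table_find(const struct gid_table *t, const uint8_t gid[GID_LEN])
{
    for (int i = 0; i < GID_TABLE_SIZE; i++) {
        const struct gid_entry *e = &t->entries[i];

        if (e->flags & GID_ENTRY_INVALID)
            continue;
        if (memcmp(e->gid, gid, GID_LEN) == 0)
            return i;
    }
    return -1;
}
```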
    • P
      IB/core: Simplify ib_query_gid to always refer to cache · f35faa4b
      Parav Pandit committed
      Currently the following inconsistencies exist.
      1. ib_query_gid() returns the GID from the software cache for a RoCE port
      and returns the GID from the HCA for an IB port.
      This is incorrect because the software GID cache is maintained regardless
      of HCA port type.
      
      2. For the IB link layer, the GID is queried from the HCA via ib_query_gid()
      and also updated in the software cache. The two might not be in sync.
      
      ULPs such as the SRP initiator, SRP target and IPoIB driver have historically
      used the ib_query_gid() API to query the GID. However, CM used the cached
      version during CM processing. When the software cache was introduced, this
      inconsistency remained.
      
      In order to simplify, improve readability and avoid the link-layer-specific
      inconsistencies above, this patch makes the following changes.
      
      1. ib_query_gid() always refers to the cache layer regardless of link
      layer.
      
      2. The cache module, which reads the GID entries from the HCA and builds the
      cache, directly invokes the HCA provider verb's query_gid() callback function.
      
      3. ib_query_port() is called at an early stage, while reading the port
      immutable property, when the GID cache is not yet built. Therefore it needs
      to read the default GID from the HCA for the IB link layer to publish the
      subnet prefix.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
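      The split between changes 1 and 2 above can be sketched as follows: only
      the cache builder talks to the hardware callback, and the public query
      reads the cache for every link layer. All names here (struct device,
      cache_update(), query_gid(), demo_hw_query_gid()) are hypothetical and
      only model the idea; they are not the ib_core implementation.

```c
#include <stdint.h>
#include <string.h>

#define GID_LEN 16
#define MAX_GIDS 2

struct device;
typedef int (*query_gid_fn)(struct device *dev, int index,
                            uint8_t gid[GID_LEN]);

/* Hypothetical device: a provider callback plus a software GID cache. */
struct device {
    query_gid_fn hw_query_gid;
    uint8_t cache[MAX_GIDS][GID_LEN];
    int cache_valid;
    int hw_calls; /* counts hardware queries, for demonstration only */
};

/* The cache builder is the only caller of the provider callback. */
int cache_update(struct device *dev)
{
    for (int i = 0; i < MAX_GIDS; i++) {
        if (dev->hw_query_gid(dev, i, dev->cache[i]))
            return -1;
    }
    dev->cache_valid = 1;
    return 0;
}

/* The public query always reads the cache, whatever the link layer. */
int query_gid(const struct device *dev, int index, uint8_t gid[GID_LEN])
{
    if (!dev->cache_valid || index < 0 || index >= MAX_GIDS)
        return -1;
    memcpy(gid, dev->cache[index], GID_LEN);
    return 0;
}

/* Demo provider callback that fabricates a GID per index. */
int demo_hw_query_gid(struct device *dev, int index, uint8_t gid[GID_LEN])
{
    dev->hw_calls++;
    memset(gid, 0, GID_LEN);
    gid[0] = 0xfe;
    gid[1] = 0x80;
    gid[15] = (uint8_t)(index + 1);
    return 0;
}
```

      In this model, repeated query_gid() calls never reach the hardware, so
      callers on every link layer see the same cached view.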
    • P
      RDMA/providers: Simplify query_gid callback of RoCE providers · 0e1f9b92
      Parav Pandit committed
      ib_query_gid() fetches the GID from the software cache maintained in
      ib_core for RoCE ports.
      
      Therefore, simplify the provider drivers for RoCE to treat query_gid()
      callback as never called for RoCE, and only require non-RoCE devices to
      implement it.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • R
      RDMA/ucma: Don't allow setting RDMA_OPTION_IB_PATH without an RDMA device · 8435168d
      Roland Dreier committed
      Check to make sure that ctx->cm_id->device is set before we use it.
      Otherwise userspace can trigger a NULL dereference by doing
      RDMA_USER_CM_CMD_SET_OPTION on an ID that is not bound to a device.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: <syzbot+a67bc93e14682d92fc2f@syzkaller.appspotmail.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
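      The guard described in this commit message amounts to checking the device
      pointer before any dereference. The sketch below models that with
      stand-in types; the function name set_ib_path_option() and the choice of
      -EINVAL as the error code are assumptions, not taken from the message.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the structures named in the commit message. */
struct rdma_device {
    int node_type;
};

struct cm_id {
    struct rdma_device *device; /* NULL until the ID is bound to a device */
};

struct ucma_ctx {
    struct cm_id *cm_id;
};

/* Hypothetical option handler: refuse to proceed on an unbound ID
 * instead of dereferencing a NULL device pointer. */
int set_ib_path_option(struct ucma_ctx *ctx)
{
    if (!ctx->cm_id->device)
        return -EINVAL; /* assumed error code */

    /* Safe to use the device from here on. */
    return ctx->cm_id->device->node_type >= 0 ? 0 : -EINVAL;
}
```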
  3. 30 Mar, 2018 4 commits
  4. 29 Mar, 2018 1 commit
  5. 28 Mar, 2018 7 commits
  6. 24 Mar, 2018 1 commit
    • P
      IB/cma: Resolve route only while receiving CM requests · 114cc9c4
      Parav Pandit committed
      Currently a CM request for RoCE follows the flow below.
      rdma_create_id()
      rdma_resolve_addr()
      rdma_resolve_route()
      For RC QPs:
      rdma_connect()
      ->cma_connect_ib()
        ->ib_send_cm_req()
          ->cm_init_av_by_path()
            ->ib_init_ah_attr_from_path()
      For UD QPs:
      rdma_connect()
      ->cma_resolve_ib_udp()
        ->ib_send_cm_sidr_req()
          ->cm_init_av_by_path()
            ->ib_init_ah_attr_from_path()
      
      In both flows, the route is already resolved before the CM request is sent.
      Therefore, the code is refactored to avoid resolving the route a second time
      in the ib_cm layer.
      ib_init_ah_attr_from_path() is extended to resolve the route when it is not
      yet resolved for the RoCE link layer. This is achieved by the caller setting
      the route_resolved field in the path record whenever it has already resolved
      the route.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
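      The route_resolved mechanism can be modelled in a few lines: the caller
      marks the path record when it has already resolved the route, and the
      address-handle setup resolves only when that mark is absent. The struct
      layout, resolve_count counter and function names are hypothetical and
      exist only to make the behaviour observable.

```c
#include <stdbool.h>

/* Hypothetical path record carrying the caller-provided hint. */
struct path_rec {
    bool route_resolved; /* set by a caller that already resolved the route */
    int resolve_count;   /* how many times resolution ran, for demonstration */
};

/* Stand-in for the expensive route resolution step. */
static int resolve_route(struct path_rec *path)
{
    path->resolve_count++;
    return 0;
}

/* Resolve only when the caller has not already done so. */
int init_ah_attr_from_path(struct path_rec *path)
{
    if (!path->route_resolved) {
        int ret = resolve_route(path);

        if (ret)
            return ret;
    }
    return 0;
}
```

      An active-side caller that went through an explicit resolve step would set
      route_resolved before calling in, while a path record built from a
      received CM request would leave it unset in this model.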
  7. 23 Mar, 2018 2 commits
  8. 20 Mar, 2018 9 commits
  9. 17 Mar, 2018 1 commit
    • L
      RDMA/restrack: Don't rely on uninitialized variable in restrack_add flow · 7d9a935e
      Leon Romanovsky committed
      The restrack code relies on the fact that object structures are zeroed at
      the allocation stage. The mlx4 CQ wasn't allocated with kzalloc, which
      caused the following crash.
      
      [  137.392209] general protection fault: 0000 [#1] SMP KASAN PTI
      [  137.392972] CPU: 0 PID: 622 Comm: ibv_rc_pingpong Tainted: G        W        4.16.0-rc1-00099-g00313983 #11
      [  137.395079] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      [  137.396866] RIP: 0010:rdma_restrack_del+0xc8/0xf0
      [  137.397762] RSP: 0018:ffff8801b54e7968 EFLAGS: 00010206
      [  137.399008] RAX: 0000000000000000 RBX: ffff8801d8bcbae8 RCX: ffffffffb82314df
      [  137.400055] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: 70696b533d454741
      [  137.401103] RBP: ffff8801d90c07a0 R08: ffff8801d8bcbb00 R09: 0000000000000000
      [  137.402470] R10: 0000000000000001 R11: ffffed0036a9cf52 R12: ffff8801d90c0ad0
      [  137.403318] R13: ffff8801d853fb20 R14: ffff8801d8bcbb28 R15: 0000000000000014
      [  137.404736] FS:  00007fb415d43740(0000) GS:ffff8801e5c00000(0000) knlGS:0000000000000000
      [  137.406074] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  137.407101] CR2: 00007fb41557df20 CR3: 00000001b580c001 CR4: 00000000003606b0
      [  137.408308] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  137.409352] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  137.410385] Call Trace:
      [  137.411058]  ib_destroy_cq+0x23/0x60
      [  137.411460]  uverbs_free_cq+0x37/0xa0
      [  137.412040]  remove_commit_idr_uobject+0x38/0xf0
      [  137.413042]  _rdma_remove_commit_uobject+0x5c/0x160
      [  137.413782]  ? lookup_get_idr_uobject+0x39/0x50
      [  137.414737]  rdma_remove_commit_uobject+0x3b/0x70
      [  137.415742]  ib_uverbs_destroy_cq+0x114/0x1d0
      [  137.416260]  ? ib_uverbs_req_notify_cq+0x160/0x160
      [  137.417073]  ? kernel_text_address+0x5c/0x90
      [  137.417805]  ? __kernel_text_address+0xe/0x30
      [  137.418766]  ? unwind_get_return_address+0x2f/0x50
      [  137.419558]  ib_uverbs_write+0x453/0x6a0
      [  137.420220]  ? show_ibdev+0x90/0x90
      [  137.420653]  ? __kasan_slab_free+0x136/0x180
      [  137.421155]  ? kmem_cache_free+0x78/0x1e0
      [  137.422192]  ? remove_vma+0x83/0x90
      [  137.422614]  ? do_munmap+0x447/0x6c0
      [  137.423045]  ? vm_munmap+0xb0/0x100
      [  137.423481]  ? SyS_munmap+0x1d/0x30
      [  137.424120]  ? do_syscall_64+0xeb/0x250
      [  137.424984]  ? entry_SYSCALL_64_after_hwframe+0x21/0x86
      [  137.425611]  ? lru_add_drain_all+0x270/0x270
      [  137.426116]  ? lru_add_drain_cpu+0xa3/0x170
      [  137.426616]  ? lru_add_drain+0x11/0x20
      [  137.427058]  ? free_pages_and_swap_cache+0xa6/0x120
      [  137.427672]  ? tlb_flush_mmu_free+0x78/0x90
      [  137.428168]  ? arch_tlb_finish_mmu+0x6d/0xb0
      [  137.428680]  __vfs_write+0xc4/0x350
      [  137.430917]  ? kernel_read+0xa0/0xa0
      [  137.432758]  ? remove_vma+0x90/0x90
      [  137.434781]  ? __kasan_slab_free+0x14b/0x180
      [  137.437486]  ? remove_vma+0x83/0x90
      [  137.439836]  ? kmem_cache_free+0x78/0x1e0
      [  137.442195]  ? percpu_counter_add_batch+0x1d/0x90
      [  137.444389]  vfs_write+0xf7/0x280
      [  137.446030]  SyS_write+0xa1/0x120
      [  137.447867]  ? SyS_read+0x120/0x120
      [  137.449670]  ? mm_fault_error+0x180/0x180
      [  137.451539]  ? _cond_resched+0x16/0x50
      [  137.453697]  ? SyS_read+0x120/0x120
      [  137.455883]  do_syscall_64+0xeb/0x250
      [  137.457686]  entry_SYSCALL_64_after_hwframe+0x21/0x86
      [  137.459595] RIP: 0033:0x7fb415637b94
      [  137.461315] RSP: 002b:00007ffdebea7d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [  137.463879] RAX: ffffffffffffffda RBX: 00005565022d1bd0 RCX: 00007fb415637b94
      [  137.466519] RDX: 0000000000000018 RSI: 00007ffdebea7da0 RDI: 0000000000000003
      [  137.469543] RBP: 00007ffdebea7d98 R08: 0000000000000000 R09: 00005565022d40c0
      [  137.472479] R10: 00000000000009cf R11: 0000000000000246 R12: 00005565022d2520
      [  137.475125] R13: 00000000000003e8 R14: 0000000000000000 R15: 00007ffdebea7fd0
      [  137.477760] Code: f7 e8 dd 0d 0b ff 48 c7 43 40 00 00 00 00 48 89 df e8 0d 0b 0b ff 48 8d 7b 28 c6 03 00 e8 41 0d 0b ff 48 8b 7b 28 48 85 ff 74 06 <f0> ff 4f 48 74 10 5b 48 89 ef 5d 41 5c 41 5d 41 5e e9 32 b0 ee
      [  137.483375] RIP: rdma_restrack_del+0xc8/0xf0 RSP: ffff8801b54e7968
      [  137.486436] ---[ end trace 81835a1ea6722eed ]---
      [  137.488566] Kernel panic - not syncing: Fatal exception
      [  137.491162] Kernel Offset: 0x36000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      
      Fixes: 00313983 ("RDMA/nldev: provide detailed CM_ID information")
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
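      Why zeroed allocation matters here can be shown with a user-space analogue,
      using calloc() in place of kzalloc(). The tracking entry, its valid flag
      and the helper names are a simplified, hypothetical model of the pattern
      and not the restrack code itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tracking entry embedded in every tracked object. */
struct restrack_entry {
    bool valid; /* true only after the entry was added to tracking */
    void *task;
};

struct cq {
    int cqe;
    struct restrack_entry res;
};

/* Zeroed allocation: res.valid starts as false and res.task as NULL.
 * With a non-zeroing allocator both fields would hold garbage, and the
 * delete path below could act on an entry that was never added. */
struct cq *cq_alloc(void)
{
    return calloc(1, sizeof(struct cq));
}

void restrack_add(struct restrack_entry *res)
{
    res->valid = true;
}

/* Returns 1 if an entry was removed, 0 if it was never added. */
int restrack_del(struct restrack_entry *res)
{
    if (!res->valid)
        return 0;
    res->valid = false;
    res->task = NULL;
    return 1;
}
```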
  10. 16 Mar, 2018 9 commits