1. 26 March 2019, 1 commit
  2. 18 March 2019, 1 commit
  3. 27 February 2019, 1 commit
  4. 23 February 2019, 2 commits
    • RDMA: Handle ucontext allocations by IB/core · a2a074ef
      Leon Romanovsky authored
      Following the PD conversion patch, do the same for ucontext allocations.
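
      As a minimal sketch of the resulting driver hook, assuming the same
      pattern as the PD conversion (the driver name "foo" and its ucontext
      struct are hypothetical, not part of this patch): IB/core allocates the
      ucontext from a size the driver declares, and the driver only
      initializes it.

      struct foo_ucontext {
              struct ib_ucontext ibucontext;
              /* driver-private state */
      };

      /* The core allocates the memory; the driver initializes it and
       * returns 0 on success or a negative errno. */
      static int foo_alloc_ucontext(struct ib_ucontext *uctx,
                                    struct ib_udata *udata)
      {
              struct foo_ucontext *ctx =
                      container_of(uctx, struct foo_ucontext, ibucontext);

              /* driver-specific initialization of ctx goes here */
              return 0;
      }

      /* The per-driver object size is declared next to the device ops,
       * e.g. INIT_RDMA_OBJ_SIZE(ib_ucontext, foo_ucontext, ibucontext). */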
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      a2a074ef
    • IB/mlx4: Increase the timeout for CM cache · 2612d723
      Håkon Bugge authored
      When using CX-3 virtual functions, either on a bare-metal machine or
      passed through to a VM, MAD packets are proxied through the PF driver.
      
      Since the VF drivers have separate namespaces for MAD Transaction IDs
      (TIDs), the PF driver has to re-map the TIDs and keep the bookkeeping
      in a cache.
      
      Following the RDMA Connection Manager (CM) protocol, it is clear when
      an entry has to be evicted from the cache. But life is not perfect:
      remote peers may die or be rebooted. Hence, a timeout is used to wipe
      out a cache entry once the PF driver assumes the remote peer has gone.
      
      During workloads where a high number of QPs are destroyed concurrently,
      an excessive number of CM DREQ retries has been observed.
      
      The problem can be demonstrated in a bare-metal environment where two
      nodes have instantiated 8 VFs each. The HCAs are dual-ported, so there
      are 16 vPorts per physical server.
      
      64 processes are associated with each vPort, and each creates and
      destroys one QP for each of the 64 remote processes. That is, 1024 QPs
      per vPort, 16K QPs in all. The QPs are created and destroyed using the
      CM.
      
      When tearing down these 16K QPs, excessive CM DREQ retries (and
      duplicates) are observed. With some cat/paste/awk wizardry on the
      infiniband_cm sysfs, we observe the following, summed over the 16
      vPorts on one of the nodes:
      
      cm_rx_duplicates:
            dreq  2102
      cm_rx_msgs:
            drep  1989
            dreq  6195
             rep  3968
             req  4224
             rtu  4224
      cm_tx_msgs:
            drep  4093
            dreq 27568
             rep  4224
             req  3968
             rtu  3968
      cm_tx_retries:
            dreq 23469
      
      Note that the active/passive side is equally distributed between the
      two nodes.
      
      Enabling pr_debug in cm.c gives tons of:
      
      [171778.814239] <mlx4_ib> mlx4_ib_multiplex_cm_handler: id{slave:
      1,sl_cm_id: 0xd393089f} is NULL!
      
      By increasing the CM_CLEANUP_CACHE_TIMEOUT from 5 to 30 seconds, the
      tear-down phase of the application is reduced from approximately 90 to
      50 seconds. Retries/duplicates are also significantly reduced:
      
      cm_rx_duplicates:
            dreq  2460
      []
      cm_tx_retries:
            dreq  3010
             req    47
      
      Increasing the timeout further didn't help, as these duplicates and
      retries stem from a too-short CMA timeout, which was 20 (~4 seconds)
      on the systems. By increasing the CMA timeout to 22 (~17 seconds), the
      numbers fell to about 10 for both of them.
      
      Adjustment of the CMA timeout is not part of this commit.
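
      As a minimal sketch of the change itself (the macro name is taken from
      this message and lives in drivers/infiniband/hw/mlx4/cm.c; the HZ-based
      definition is an assumption):

      /* Lifetime of a proxied-TID cache entry before the PF driver assumes
       * the remote peer is gone; raised from 5 to 30 seconds. */
      #define CM_CLEANUP_CACHE_TIMEOUT  (30 * HZ)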
      Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
      Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      2612d723
  5. 16 February 2019, 1 commit
  6. 09 February 2019, 1 commit
  7. 31 January 2019, 1 commit
  8. 22 January 2019, 1 commit
    • IB/mlx4: Fix using wrong function to destroy sqp AHs under SRIOV · f45f8edb
      Jack Morgenstein authored
      The commit cited below replaced rdma_create_ah with
      mlx4_ib_create_slave_ah when creating AHs for the paravirtualized special
      QPs.
      
      However, this change also required replacing rdma_destroy_ah with
      mlx4_ib_destroy_ah in the affected flows.
      
      The commit missed 3 places where rdma_destroy_ah should have been replaced
      with mlx4_ib_destroy_ah.
      
      As a result, the pd usecount was decremented when the ah was destroyed --
      although the usecount was NOT incremented when the ah was created.
      
      This caused the pd usecount to become negative, and resulted in the
      WARN_ON stack trace below when the mlx4_ib.ko module was unloaded:
      
      WARNING: CPU: 3 PID: 25303 at drivers/infiniband/core/verbs.c:329 ib_dealloc_pd+0x6d/0x80 [ib_core]
      Modules linked in: rdma_ucm rdma_cm iw_cm ib_cm ib_umad mlx4_ib(-) ib_uverbs ib_core mlx4_en mlx4_core nfsv3 nfs fscache configfs xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc dm_mirror dm_region_hash dm_log dm_mod dax rndis_wlan rndis_host coretemp kvm_intel cdc_ether kvm usbnet iTCO_wdt iTCO_vendor_support cfg80211 irqbypass lpc_ich ipmi_si i2c_i801 mii pcspkr i2c_core mfd_core ipmi_devintf i7core_edac ipmi_msghandler ioatdma pcc_cpufreq dca acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi mptsas scsi_transport_sas mptscsih crc32c_intel ata_piix bnx2 mptbase ipv6 crc_ccitt autofs4 [last unloaded: mlx4_core]
      CPU: 3 PID: 25303 Comm: modprobe Tainted: G        W I       5.0.0-rc1-net-mlx4+ #1
      Hardware name: IBM  -[7148ZV6]-/Node 1, System Card, BIOS -[MLE170CUS-1.70]- 09/23/2011
      RIP: 0010:ib_dealloc_pd+0x6d/0x80 [ib_core]
      Code: 00 00 85 c0 75 02 5b c3 80 3d aa 87 03 00 00 75 f5 48 c7 c7 88 d7 8f a0 31 c0 c6 05 98 87 03 00 01 e8 07 4c 79 e0 0f 0b 5b c3 <0f> 0b eb be 0f 0b eb ab 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66
      RSP: 0018:ffffc90005347e30 EFLAGS: 00010282
      RAX: 00000000ffffffea RBX: ffff8888589e9540 RCX: 0000000000000006
      RDX: 0000000000000006 RSI: ffff88885d57ad40 RDI: 0000000000000000
      RBP: ffff88885b029c00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000004 R12: ffff8887f06c0000
      R13: ffff8887f06c13e8 R14: 0000000000000000 R15: 0000000000000000
      FS:  00007fd6743c6740(0000) GS:ffff88887fcc0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000ed1038 CR3: 00000007e3156000 CR4: 00000000000006e0
      Call Trace:
       mlx4_ib_close_sriov+0x125/0x180 [mlx4_ib]
       mlx4_ib_remove+0x57/0x1f0 [mlx4_ib]
       mlx4_remove_device+0x92/0xa0 [mlx4_core]
       mlx4_unregister_interface+0x39/0x90 [mlx4_core]
       mlx4_ib_cleanup+0xc/0xd7 [mlx4_ib]
       __x64_sys_delete_module+0x17d/0x290
       ? trace_hardirqs_off_thunk+0x1a/0x1c
       ? do_syscall_64+0x12/0x180
       do_syscall_64+0x4a/0x180
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
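
      As a sketch of the asymmetry being fixed (the function names come from
      this message; the surrounding flow and exact signatures are simplified
      assumptions):

      /* Create path: the slave AH is built directly and takes no reference
       * on the PD. */
      ah = mlx4_ib_create_slave_ah(pd, &ah_attr);

      /* Before the fix: drops a PD reference that was never taken, driving
       * the PD usecount negative. */
      rdma_destroy_ah(ah);

      /* After the fix: the destroy path matches the slave create path. */
      mlx4_ib_destroy_ah(ah);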
      
      Fixes: 5e62d5ff ("IB/mlx4: Create slave AH's directly")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      f45f8edb
  9. 15 January 2019, 2 commits
  10. 11 January 2019, 3 commits
  11. 21 December 2018, 1 commit
  12. 20 December 2018, 2 commits
  13. 19 December 2018, 2 commits
  14. 12 December 2018, 3 commits
  15. 07 December 2018, 1 commit
  16. 23 November 2018, 2 commits
  17. 17 October 2018, 4 commits
  18. 04 October 2018, 1 commit
  19. 27 September 2018, 2 commits
  20. 22 September 2018, 1 commit
  21. 21 September 2018, 1 commit
  22. 07 September 2018, 1 commit
  23. 31 July 2018, 3 commits
    • IB/mlx4: Use 4K pages for kernel QP's WQE buffer · f95ccffc
      Jack Morgenstein authored
      In the current implementation, the driver tries to allocate contiguous
      memory, and if it fails, it falls back to 4K fragmented allocation.
      
      Once the memory is fragmented, the first allocation might take a long
      time or even fail, which can cause connection failures.
      
      This patch changes the logic to always allocate with 4K granularity,
      since it's more robust and more likely to succeed.
      
      This patch was tested with Lustre and no performance degradation
      was observed.
      
      Note: This commit eliminates the "shrinking WQE" feature. This feature
      depended on using vmap to create a virtually contiguous send WQ.
      vmap use was abandoned due to problems with several processors (see the
      commit cited below). As a result, shrinking WQE was available only with
      physically contiguous send WQs. Allocating such send WQs caused the
      problems described above.
      Therefore, as a side effect of eliminating the use of large physically
      contiguous send WQs, the shrinking WQE feature became unavailable.
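
      As a sketch of the allocation change, assuming the
      mlx4_buf_alloc(dev, size, max_direct, buf) helper of that era (the
      exact call site in create_qp_common() is simplified):

      /* Before: request one large physically contiguous buffer, falling
       * back to 4K fragments only on failure. */
      err = mlx4_buf_alloc(dev->dev, qp->buf_size, qp->buf_size, &qp->buf);

      /* After: cap the direct part at one page, so the WQE buffer is always
       * built from 4K fragments and never depends on a high-order
       * allocation. */
      err = mlx4_buf_alloc(dev->dev, qp->buf_size, PAGE_SIZE, &qp->buf);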
      
      Warning example:
      worker/20:1: page allocation failure: order:8, mode:0x80d0
      CPU: 20 PID: 513 Comm: kworker/20:1 Tainted: G OE ------------
      Workqueue: ib_cm cm_work_handler [ib_cm]
      Call Trace:
      [<ffffffff81686d81>] dump_stack+0x19/0x1b
      [<ffffffff81186160>] warn_alloc_failed+0x110/0x180
      [<ffffffff8118a954>] __alloc_pages_nodemask+0x9b4/0xba0
      [<ffffffff811ce868>] alloc_pages_current+0x98/0x110
      [<ffffffff81184fae>] __get_free_pages+0xe/0x50
      [<ffffffff8133f6fe>] swiotlb_alloc_coherent+0x5e/0x150
      [<ffffffff81062551>] x86_swiotlb_alloc_coherent+0x41/0x50
      [<ffffffffa056b4c4>] mlx4_buf_direct_alloc.isra.7+0xc4/0x180 [mlx4_core]
      [<ffffffffa056b73b>] mlx4_buf_alloc+0x1bb/0x260 [mlx4_core]
      [<ffffffffa0b15496>] create_qp_common+0x536/0x1000 [mlx4_ib]
      [<ffffffff811c6ef7>] ? dma_pool_free+0xa7/0xd0
      [<ffffffffa0b163c1>] mlx4_ib_create_qp+0x3b1/0xdc0 [mlx4_ib]
      [<ffffffffa0b01bc2>] ? mlx4_ib_create_cq+0x2d2/0x430 [mlx4_ib]
      [<ffffffffa0b21f20>] mlx4_ib_create_qp_wrp+0x10/0x20 [mlx4_ib]
      [<ffffffffa08f152a>] ib_create_qp+0x7a/0x2f0 [ib_core]
      [<ffffffffa06205d4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
      [<ffffffffa08275c9>] kiblnd_create_conn+0xbf9/0x1950 [ko2iblnd]
      [<ffffffffa074077a>] ? cfs_percpt_unlock+0x1a/0xb0 [libcfs]
      [<ffffffffa0835519>] kiblnd_passive_connect+0xa99/0x18c0 [ko2iblnd]
      
      Fixes: 73898db0 ("net/mlx4: Avoid wrong virtual mappings")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      f95ccffc
    • RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const · d34ac5cd
      Bart Van Assche authored
      Since neither ib_post_send() nor ib_post_recv() modifies the data
      structure its second argument points at, declare that argument const.
      This change makes it necessary to declare the 'bad_wr' argument const
      too and also to modify all ULPs that call ib_post_send(),
      ib_post_recv() or ib_post_srq_recv(). This patch does not change any
      functionality but makes it possible for the compiler to verify that
      the ib_post_(send|recv|srq_recv) functions really do not modify the
      posted work request.
      
      To make this possible, only one cast that casts away constness had to
      be introduced, namely in rpcrdma_post_recvs(). The only way I can think
      of to avoid that cast is to introduce an additional loop in that
      function or to change the data type of bad_wr from struct ib_recv_wr **
      to int (an index that refers to an element in the work request list).
      However, both approaches would require even more extensive changes than
      this patch.
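
      The resulting posting interfaces look roughly as follows (prototypes
      paraphrased from include/rdma/ib_verbs.h; parameter names and inline
      qualifiers may differ by kernel version):

      int ib_post_send(struct ib_qp *qp, const struct ib_send_wr *send_wr,
                       const struct ib_send_wr **bad_send_wr);
      int ib_post_recv(struct ib_qp *qp, const struct ib_recv_wr *recv_wr,
                       const struct ib_recv_wr **bad_recv_wr);
      int ib_post_srq_recv(struct ib_srq *srq, const struct ib_recv_wr *recv_wr,
                           const struct ib_recv_wr **bad_recv_wr);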
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      d34ac5cd
    • RDMA: Constify the argument of the work request conversion functions · f696bf6d
      Bart Van Assche authored
      When posting a send work request, the work request that is posted is not
      modified by any of the RDMA drivers. Make this explicit by constifying
      most ib_send_wr pointers in RDMA transport drivers.
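
      An illustrative driver-internal prototype after constification (the
      function and type names here are hypothetical, not taken from the
      patch):

      /* A WQE-build helper can now promise not to modify the posted WR. */
      int foo_build_send_wqe(struct foo_qp *qp, const struct ib_send_wr *wr,
                             void *wqe);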
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      f696bf6d
  24. 11 July 2018, 1 commit
    • RDMA: Fix storage of PortInfo CapabilityMask in the kernel · 2f944c0f
      Jason Gunthorpe authored
      The internal flag IP_BASED_GIDS was added to a field that was being
      used to hold the PortInfo CapabilityMask, without considering the
      effects this would have. Since most drivers just use the value from the
      HW MAD, IP_BASED_GIDS would also become set on any HW that sets the IBA
      flag IsOtherLocalChangesNoticeSupported, which is not intended.
      
      Fix this by keeping port_cap_flags only for the IBA CapabilityMask
      value and storing unrelated flags externally. Move the bit definitions
      for this to ib_mad.h to make it clear what is happening.
      
      To keep the uAPI unchanged, define a new set of flags in the uapi
      header that are only used by ib_uverbs_query_port_resp.port_cap_flags;
      these match the flags currently supported in rdma-core and the values
      exposed by the current kernel.
      
      Fixes: b4a26a27 ("IB: Report using RoCE IP based gids in port caps")
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      2f944c0f
  25. 05 July 2018, 1 commit