提交 · c5c4d92e70f37369b5bdca5e85f9fc55dc2c8a3b · openeuler / Kernel

06 9月, 2018 3 次提交

RDMA/uverbs: Use cdev_device_add() instead of cdev_add() · c5c4d92e

由 Parav Pandit 提交于 9月 05, 2018

Instead of doing two step process to add char device and create underlying
device, use cdev_device_add() which does both.

Currently a kobject per uverbs_device is created to keep reference to its
holding ib_uverbs_device in addition to its underlying device 'dev'.

Instead just use uverbs_device->dev to keep a reference to.

With this change there is single reference tracker for ib_uverbs_device
structure.

This allows for subsequent patch to registers group attribute as well
using single API cdev_device_add().
Signed-off-by: NParav Pandit <parav@mellanox.com>
Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

c5c4d92e

RDMA/core: Depend on device_add() to add device attributes · adee9f3f

由 Parav Pandit 提交于 9月 05, 2018

Instead of adding/removing device attribute files, depend on device_add()
which considers adding these device files based on NULL terminated
attributes group array.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

adee9f3f

RDMA/uverbs: Fix error cleanup path of ib_uverbs_add_one() · 08e74be1

由 Parav Pandit 提交于 9月 05, 2018

If ib_uverbs_create_uapi() fails, dev_num should be freed from the bitmap.

Fixes: 7d96c9b1 ("IB/uverbs: Have the core code create the uverbs_root_spec")
Signed-off-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

08e74be1

05 9月, 2018 2 次提交

IB/core: Release object lock if destroy failed · e4ff3d22

由 Artemy Kovalyov 提交于 8月 28, 2018

The object lock was supposed to always be released during destroy, but
when the destruction retry series was integrated with the destroy series
it created a failure path that missed the unlock.

Keep with convention, if destroy fails the caller must undo all locking.

Fixes: 87ad80ab ("IB/uverbs: Consolidate uobject destruction")
Signed-off-by: NArtemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

e4ff3d22

RDMA/ucma: check fd type in ucma_migrate_id() · 0d23ba60

由 Jann Horn 提交于 9月 03, 2018

The current code grabs the private_data of whatever file descriptor
userspace has supplied and implicitly casts it to a `struct ucma_file *`,
potentially causing a type confusion.

This is probably fine in practice because the pointer is only used for
comparisons, it is never actually dereferenced; and even in the
comparisons, it is unlikely that a file from another filesystem would have
a ->private_data pointer that happens to also be valid in this context.
But ->private_data is not always guaranteed to be a valid pointer to an
object owned by the file's filesystem; for example, some filesystems just
cram numbers in there.

Check the type of the supplied file descriptor to be safe, analogous to how
other places in the kernel do it.

Fixes: 88314e4d ("RDMA/cma: add support for rdma_migrate_id()")
Signed-off-by: NJann Horn <jannh@google.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

0d23ba60

23 8月, 2018 1 次提交

mm, oom: distinguish blockable mode for mmu notifiers · 93065ac7

由 Michal Hocko 提交于 8月 21, 2018

There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start and that is a problem for the
oom_reaper because it needs to guarantee a forward progress so it cannot
depend on any sleepable locks.

Currently we simply back off and mark an oom victim with blockable mmu
notifiers as done after a short sleep.  That can result in selecting a new
oom victim prematurely because the previous one still hasn't torn its
memory down yet.

We can do much better though.  Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held.  Moreover
majority of notifiers only care about a portion of the address space and
there is absolutely zero reason to fail when we are unmapping an unrelated
range.  Many notifiers do really block and wait for HW which is harder to
handle and we have to bail out though.

This patch handles the low hanging fruit.
__mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
are not allowed to sleep if the flag is set to false.  This is achieved by
using trylock instead of the sleepable lock for most callbacks and
continue as long as we do not block down the call chain.

I think we can improve that even further because there is a common pattern
to do a range lookup first and then do something about that.  The first
part can be done without a sleeping lock in most cases AFAICS.

The oom_reaper end then simply retries if there is at least one notifier
which couldn't make any progress in !blockable mode.  A retry loop is
already implemented to wait for the mmap_sem and this is basically the
same thing.

The simplest way for driver developers to test this code path is to wrap
userspace code which uses these notifiers into a memcg and set the hard
limit to hit the oom.  This can be done e.g.  after the test faults in all
the mmu notifier managed memory and set the hard limit to something really
small.  Then we are looking for a proper process tear down.

[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: minor code simplification]
Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
Reported-by: NDavid Rientjes <rientjes@google.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

93065ac7

16 8月, 2018 5 次提交

IB/core: Change filter function return type from int to bool · dd81b2c8

由 Parav Pandit 提交于 8月 14, 2018

Filter functions returns either 0 or 1, therefore better change their
return type from int to bool to reflect the same. Additionally some
filter functions have suffix of _filter some doesn't. Make all filter
function consistent to have __filter suffix to improve code readability.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

dd81b2c8

IB/core: Update GID entries for netdevice whose mac address changes · d12e2eed

由 Parav Pandit 提交于 8月 14, 2018

Update all GID table entries of the netdevice whose MAC address changed.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

d12e2eed

IB/core: Add default GIDs of the bond master netdev · 464b79b4

由 Parav Pandit 提交于 8月 14, 2018

Currently following issues exist:
1. Default GIDs of the lower (slave) netdevice if the bond netdevice is
   added. Rather default GID should be of bond master netdevice.
2. Due to this, when failover event occurs FAILOVER event handler attempts
   to delete the GID of the upper device and tries to add the default GID
   of the lower device. This is incorrect behavior.

To have simple and correct code:
(a) Split default GIDs addition out of add_netdev_ips().  This allows
    easier removal in future if RoCE default GIDs are removed.
(b) Add default GIDs of the bond master device by using right filter and
    callback function.
(c) Remove unused function enum_netdev_default_gids().
Signed-off-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

464b79b4

IB/core: Consider adding default GIDs of bond device · a03d4d27

由 Parav Pandit 提交于 8月 14, 2018

Now that we correctly delete the default GIDs of lower devices during
CHANGEUPPER event, add default GIDs of the bonding master device.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

a03d4d27

IB/core: Delete lower netdevice default GID entries in bonding scenario · 408f1242

由 Parav Pandit 提交于 8月 14, 2018

When NETDEV_CHANGEUPPER event occurs, lower device is not yet established
as slave of the master, and when upper device is bond device, default GID
entries not deleted.

Due to this, when bond device is fully configured, default GID entries of
bond device cannot be added as default GID entries are occupied by the
lower netdevice. This is incorrect.

Default GID entries should really be of bond netdevice because in all RoCE
GIDs (default or IP), MAC address of the bond device will be used.  It is
confusing to have default GID of netdevice which is not really used for
any purpose.

Therefore, as first step, implement
(a) filter function which filters if a CHANGEUPPER event netdevice and
    associated upper device is master device or not.
(b) callback function which deletes the default GIDs of lower (event
    netdevice).
Signed-off-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

408f1242

15 8月, 2018 2 次提交

IB/core: Avoid confusing del_netdev_default_ips · b9f09866

由 Parav Pandit 提交于 8月 14, 2018

Currently bond_delete_netdev_default_gids() is called by two callers.
(a) del_netdev_default_ips_join()
(b) del_netdev_default_ips()

Both above functions changes the argument order while calling
bond_delete_netdev_default_gids().  This required silly
del_netdev_default_ips() wrapper.

Additionally, del_netdev_default_ips() deletes default GIDs not IP based
GIDs.  del_netdev_default_ips() having _ips suffix is confusing.

Therefore, get rid of confusing del_netdev_default_ips() and simplify
bond_delete_netdev_default_gids() to follow same argument order as its
caller.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

b9f09866

IB/core: Add comment for change upper netevent handling · 666e7099

由 Parav Pandit 提交于 8月 14, 2018

Add comment for handling CHANGEUPPER netevent handling.
To improve code readability,
(a) move cmd definitions to its respective if-else branches,
(b) avoid single line structure definitions.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

666e7099

14 8月, 2018 1 次提交

IB/ucm: Fix compiling ucm.c · 486edfb1

由 Jason Gunthorpe 提交于 8月 13, 2018

Even though this interface is marked CONFIG_BROKEN we still expect it to
compile, at least until we delete it completely.

Also mark INFINIBAND_USER_ACCESS_UCM with COMPILE_TEST so these situations
can be detected.

Fixes: e7ff98ae ("RDMA/cma: Constify path record, ib_cm_event, listen_id pointers")
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

486edfb1

13 8月, 2018 5 次提交

IB/uverbs: Do not check for device disassociation during ioctl · 4ce719f8

由 Jason Gunthorpe 提交于 8月 09, 2018

Now that the ioctl path and uobjects are converted to use uverbs_api, it
is now safe to remove the disassociation protection from the common ioctl
code.

This completes the work to make destroy functions continue to work even
after device disassociation.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

4ce719f8

J
IB/uverbs: Remove struct uverbs_root_spec and all supporting code · 51d0a2b4
由 Jason Gunthorpe 提交于 8月 09, 2018
```
Everything now uses the uverbs_uapi data structure.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
```
51d0a2b4

IB/uverbs: Use uverbs_api to unmarshal ioctl commands · 3a863577

由 Jason Gunthorpe 提交于 8月 09, 2018

Convert the ioctl method syscall path to use the uverbs_api data
structures. The new uapi structure includes all the same information, just
in a different and more optimal way.

 - Use attr_bkey instead of 2 level radix trees for everything related to
   attributes. This includes the attribute storage, presence, and
   detection of missing mandatory attributes.
 - Avoid iterating over all attribute storage at finish, instead use
   find_first_bit with the attr_bkey to locate only those attrs that need
   cleanup.
 - Organize things to always run, and always rely on, cleanup. This
   avoids a bunch of tricky error unwind cases.
 - Locate the method using the radix tree, and locate the attributes
   using a very efficient incremental radix tree lookup
 - Use the precomputed destroy_bkey to handle uobject destruction
 - Use the precomputed allocation sizes and precomputed 'need_stack'
   to avoid maths in the fast path. This is optimal if userspace
   does not pass (many) unsupported attributes.

Overall this results in much better codegen for the attribute accessors,
everything is now stored in bitmaps or linear arrays indexed by attr_bkey.
The compiler can compute attr_bkey values at compile time for all method
attributes, meaning things like uverbs_attr_is_valid() now compile into
single instruction bit tests.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

3a863577

IB/uverbs: Use uverbs_alloc for allocations · b61815e2

由 Jason Gunthorpe 提交于 8月 09, 2018

Several handlers need temporary allocations for the life of the method,
switch them to use the uverbs_alloc allocator.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>

b61815e2

IB/uverbs: Add a simple allocator to uverbs_attr_bundle · 461bb2ee

由 Jason Gunthorpe 提交于 8月 09, 2018

This is similar in spirit to devm, it keeps track of any allocations
linked to this method call and ensures they are all freed when the method
exits. Further, if there is space in the internal/onstack buffer then the
allocator will hand out that memory and avoid an expensive call to
kalloc/kfree in the syscall path.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>

461bb2ee

11 8月, 2018 5 次提交

IB/uverbs: Remove the ib_uverbs_attr pointer from each attr · 6a1f444f

由 Jason Gunthorpe 提交于 8月 09, 2018

Memory in the bundle is valuable, do not waste it holding an 8 byte
pointer for the rare case of writing to a PTR_OUT. We can compute the
pointer by storing a small 1 byte array offset and the base address of the
uattr memory in the bundle private memory.

This also means we can access the kernel's copy of the ib_uverbs_attr, so
drop the copy of flags as well.

Since the uattr base should be private bundle information this also
de-inlines the already too big uverbs_copy_to inline and moves
create_udata into uverbs_ioctl.c so they can see the private struct
definition.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>

6a1f444f

IB/uverbs: Provide implementation private memory for the uverbs_attr_bundle · 4b3dd2bb

由 Jason Gunthorpe 提交于 8月 09, 2018

This already existed as the anonymous 'ctx' structure, but this was not
really a useful form. Hoist this struct into bundle_priv and rework the
internal things to use it instead.

Move a bunch of the processing internal state into the priv and reduce the
excessive use of function arguments.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>

4b3dd2bb

IB/uverbs: Use uverbs_api to manage the object type inside the uobject · 6b0d08f4

由 Jason Gunthorpe 提交于 8月 09, 2018

Currently the struct uverbs_obj_type stored in the ib_uobject is part of
the .rodata segment of the module that defines the object. This is a
problem if drivers define new uapi objects as we will be left with a
dangling pointer after device disassociation.

Switch the uverbs_obj_type for struct uverbs_api_object, which is
allocated memory that is part of the uverbs_api and is guaranteed to
always exist. Further this moves the 'type_class' into this memory which
means access to the IDR/FD function pointers is also guaranteed. Drivers
cannot define new types.

This makes it safe to continue to use all uobjects, including driver
defined ones, after disassociation.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

6b0d08f4

IB/uverbs: Build the specs into a radix tree at runtime · 9ed3e5f4

由 Jason Gunthorpe 提交于 8月 09, 2018

This radix tree datastructure is intended to replace the 'hash' structure
used today for parsing ioctl methods during system calls. This first
commit introduces the structure and builds it from the existing .rodata
descriptions.

The so-called hash arrangement is actually a 5 level open coded radix tree.
This new version uses a 3 level radix tree built using the radix tree
library.

Overall this is much less code and much easier to build as the radix tree
API allows for dynamic modification during the building. There is a small
memory penalty to pay for this, but since the radix tree is allocated on
a per device basis, a few kb of RAM seems immaterial considering the
gained simplicity.

The radix tree is similar to the existing tree, but also has a 'attr_bkey'
concept, which is a small value'd index for each method attribute. This is
used to simplify and improve performance of everything in the next
patches.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>

9ed3e5f4

IB/uverbs: Have the core code create the uverbs_root_spec · 7d96c9b1

由 Jason Gunthorpe 提交于 8月 09, 2018

There is no reason for drivers to do this, the core code should take of
everything. The drivers will provide their information from rodata to
describe their modifications to the core's base uapi specification.

The core uses this to build up the runtime uapi for each device.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>

7d96c9b1

10 8月, 2018 1 次提交

IB/uverbs: Fix reading of 32 bit flags · 922983c2

由 Jason Gunthorpe 提交于 8月 09, 2018

This is missing a zeroing of the high bits of flags, and is also not
correct for big endian machines. Properly zero extend the 32 bit flags
into the 64 bit stack variable.
Reported-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Fixes: bccd0622 ("IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs language")
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>

922983c2

08 8月, 2018 1 次提交

IB/ucm: Initialize sgid request GID attribute pointer · 58796e67

由 Parav Pandit 提交于 8月 06, 2018

sgid_attr is uninitialized on the stack, initialize it to NULL.

Fixes: 39839107 ("IB/cm: Replace members of sa_path_rec with 'struct sgid_attr *'")
Signed-off-by: NParav Pandit <parav@mellanox.com>
Reviewed-by: NYossi Itigin <yosefe@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

58796e67

02 8月, 2018 11 次提交

IB/uverbs: Allow all DESTROY commands to succeed after disassociate · 0f50d88a

由 Jason Gunthorpe 提交于 7月 25, 2018

The disassociate function was broken by design because it failed all
commands. This prevents userspace from calling destroy on a uobject after
it has detected a device fatal error and thus reclaiming the resources in
userspace is prevented.

This fix is now straightforward, when anything destroys a uobject that is
not the user the object remains on the IDR with a NULL context and object
pointer. All lookup locking modes other than DESTROY will fail. When the
user ultimately calls the destroy function it is simply dropped from the
IDR while any related information is returned.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

0f50d88a

IB/uverbs: Do not block disassociate during write() · a9b66d64

由 Jason Gunthorpe 提交于 7月 25, 2018

Now that all the callbacks are safe to run concurrently with
disassociation this test can be eliminated. The ufile core infrastructure
becomes entirely self contained and is not sensitive to disassociation.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

a9b66d64

IB/uverbs: Do not pass struct ib_device to the ioctl methods · e83f0ecd

由 Jason Gunthorpe 提交于 7月 25, 2018

This does the same as the patch before, except for ioctl. The rules are
the same, but for the ioctl methods the core code handles setting up the
uobject.

- Retrieve the ib_dev from the uobject->context->device. This is
  safe under ioctl as the core has already done rdma_alloc_begin_uobject
  and so CREATE calls are entirely protected by the rwsem.
- Retrieve the ib_dev from uobject->object
- Call ib_uverbs_get_ucontext()
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

e83f0ecd

IB/uverbs: Do not pass struct ib_device to the write based methods · bbd51e88

由 Jason Gunthorpe 提交于 7月 25, 2018

This is a step to get rid of the global check for disassociation. In this
model, the ib_dev is not proven to be valid by the core code and cannot be
provided to the method. Instead, every method decides if it is able to
run after disassociation and obtains the ib_dev using one of three
different approaches:

- Call srcu_dereference on the udevice's ib_dev. As before, this means
  the method cannot be called after disassociation begins.
  (eg alloc ucontext)
- Retrieve the ib_dev from the ucontext, via ib_uverbs_get_ucontext()
- Retrieve the ib_dev from the uobject->object after checking
  under SRCU if disassociation has started (eg uobj_get)

Largely, the code is all ready for this, the main work is to provide a
ib_dev after calling uobj_alloc(). The few other places simply use
ib_uverbs_get_ucontext() to get the ib_dev.

This flexibility will let the next patches allow destroy to operate
after disassociation.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

bbd51e88

IB/uverbs: Lower the test for ongoing disassociation · cc2e14e6

由 Jason Gunthorpe 提交于 7月 25, 2018

Commands that are reading/writing to objects can test for an ongoing
disassociation during their initial call to rdma_lookup_get_uobject.  This
directly prevents all of these commands from conflicting with an ongoing
disassociation.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

cc2e14e6

IB/uverbs: Allow uobject allocation to work concurrently with disassociate · 1e857e65

由 Jason Gunthorpe 提交于 7月 25, 2018

After all the recent structural changes this is now straightforward, hold
the hw_destroy_rwsem across the entire uobject creation. We already take
this semaphore on the success path, so holding it a bit longer is not
going to change the performance.

After this change none of the create callbacks require the
disassociate_srcu lock to be correct.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

1e857e65

IB/uverbs: Allow RDMA_REMOVE_DESTROY to work concurrently with disassociate · 7452a3c7

由 Jason Gunthorpe 提交于 7月 25, 2018

After all the recent structural changes this is now straightfoward, hoist
the hw_destroy_rwsem up out of rdma_destroy_explicit and wrap it around
the uobject write lock as well as the destroy.

This is necessary as obtaining a write lock concurrently with
uverbs_destroy_ufile_hw() will cause malfunction.

After this change none of the destroy callbacks require the
disassociate_srcu lock to be correct.

This requires introducing a new lookup mode, UVERBS_LOOKUP_DESTROY as the
IOCTL interface needs to hold an unlocked kref until all command
verification is completed.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

7452a3c7

IB/uverbs: Convert 'bool exclusive' into an enum · 9867f5c6

由 Jason Gunthorpe 提交于 7月 25, 2018

This is more readable, and future patches will need a 3rd lookup type.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

9867f5c6

IB/uverbs: Consolidate uobject destruction · 87ad80ab

由 Jason Gunthorpe 提交于 7月 25, 2018

There are several flows that can destroy a uobject and each one is
minimized and sprinkled throughout the code base, making it difficult to
understand and very hard to modify the destroy path.

Consolidate all of these into uverbs_destroy_uobject() and call it in all
cases where a uobject has to be destroyed.

This makes one change to the lifecycle, during any abort (eg when
alloc_commit is not called) we always call out to alloc_abort, even if
remove_commit needs to be called to delete a HW object.

This also renames RDMA_REMOVE_DURING_CLEANUP to RDMA_REMOVE_ABORT to
clarify its actual usage and revises some of the comments to reflect what
the life cycle is for the type implementation.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

87ad80ab

IB/uverbs: Make the write path destroy methods use the same flow as ioctl · 32ed5c00

由 Jason Gunthorpe 提交于 7月 25, 2018

The ridiculous dance with uobj_remove_commit() is not needed, the write
path can follow the same flow as ioctl - lock and destroy the HW object
then use the data left over in the uobject to form the response to
userspace.

Two helpers are introduced to make this flow straightforward for the
caller.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

32ed5c00

IB/uverbs: Remove rdma_explicit_destroy() from the ioctl methods · aa72c9a5

由 Jason Gunthorpe 提交于 7月 26, 2018

The core code will destroy the HW object on behalf of the method, if the
method provides an implementation it must simply copy data from the stub
uobj into the response. Destroy methods cannot touch the HW object.
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

aa72c9a5

31 7月, 2018 3 次提交

RDMA/core: Prefix _ib to IB/RoCE specific functions · 85463316

由 Parav Pandit 提交于 7月 29, 2018

In rdma cm module, functions which are common between IB and iWarp
are named with cma_.
iWarp specific functions are prefixed with cma_iw.
IB specific functions are perfixed with cma_ib.

However some functions in request processing path didn't follow
cma_ib notion. Prefix them with _ib for better code clarity.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

85463316

RDMA/core: Simplify gid type check in cma_acquire_dev() · 79d684f0

由 Parav Pandit 提交于 7月 29, 2018

cma_add_one() initializes the default GID regardless of device type.
listen_id is bound to a device and an IP address, its GID type is
initialized by cma_acquire_dev().

Therefore a valid default GID type is always available, it is not needed
to check port type during cma_acquire_dev().

Initialize gid type of a cm id when the cm_id is created instead of
doing conditional checks during cma_acquire_dev() and trying to
initialize to 0 during _cma_attach_to_dev().
Signed-off-by: NParav Pandit <parav@mellanox.com>
Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

79d684f0

RDMA/core: Avoid holding lock while initializing fields on stack · 7582df82

由 Parav Pandit 提交于 7月 29, 2018

In various functions rdma_cm_event is zero initialized on stack using
memset() while holding lock which is not necessary.
Therefore, don't hold the lock while initializing on stack.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

7582df82

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功