1. 13 Aug, 2018 3 commits
    • IB/uverbs: Remove struct uverbs_root_spec and all supporting code · 51d0a2b4
      Jason Gunthorpe committed
      Everything now uses the uverbs_uapi data structure.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • IB/uverbs: Use uverbs_api to unmarshal ioctl commands · 3a863577
      Jason Gunthorpe committed
      Convert the ioctl method syscall path to use the uverbs_api data
      structures. The new uapi structure includes all the same information, just
      in a different and more optimal way.
      
       - Use attr_bkey instead of 2 level radix trees for everything related to
         attributes. This includes the attribute storage, presence, and
         detection of missing mandatory attributes.
       - Avoid iterating over all attribute storage at finish, instead use
         find_first_bit with the attr_bkey to locate only those attrs that need
         cleanup.
       - Organize things to always run, and always rely on, cleanup. This
         avoids a bunch of tricky error unwind cases.
       - Locate the method using the radix tree, and locate the attributes
         using a very efficient incremental radix tree lookup
       - Use the precomputed destroy_bkey to handle uobject destruction
       - Use the precomputed allocation sizes and precomputed 'need_stack'
         to avoid maths in the fast path. This is optimal if userspace
         does not pass (many) unsupported attributes.
      
      Overall this results in much better codegen for the attribute accessors,
      everything is now stored in bitmaps or linear arrays indexed by attr_bkey.
      The compiler can compute attr_bkey values at compile time for all method
      attributes, meaning things like uverbs_attr_is_valid() now compile into
      single instruction bit tests.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
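To illustrate the bitmap idea described in this commit message, here is a minimal user-space sketch, not the kernel code. It assumes at most 64 attributes per method, so that one 64-bit word can hold each bitmap. The names `attr_bitmaps`, `attr_mark_present`, `attr_is_present`, `attr_missing_mandatory` and `attr_cleanup` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: a method has at most 64 attributes. */
#define MAX_ATTR_BKEY 64u

struct attr_bitmaps {
	uint64_t present;   /* bit n set: attribute with bkey n was supplied */
	uint64_t mandatory; /* bit n set: attribute with bkey n is required */
};

/* Record that the attribute with the given bkey was supplied. */
static void attr_mark_present(struct attr_bitmaps *b, unsigned int bkey)
{
	assert(bkey < MAX_ATTR_BKEY);
	b->present |= UINT64_C(1) << bkey;
}

/* Presence test: a shift and a mask, cheap when bkey is a constant. */
static int attr_is_present(const struct attr_bitmaps *b, unsigned int bkey)
{
	assert(bkey < MAX_ATTR_BKEY);
	return (int)((b->present >> bkey) & 1);
}

/* A non-zero result names the mandatory attributes that are missing. */
static uint64_t attr_missing_mandatory(const struct attr_bitmaps *b)
{
	return b->mandatory & ~b->present;
}

/* Visit only the set bits; returns how many entries were cleaned up. */
static unsigned int attr_cleanup(uint64_t needs_cleanup)
{
	unsigned int cleaned = 0;

	while (needs_cleanup) {
		needs_cleanup &= needs_cleanup - 1; /* clear the lowest set bit */
		cleaned++;
	}
	return cleaned;
}
```

The cleanup loop touches only attributes whose bit is set, which is the same effect the message attributes to using find_first_bit instead of walking all attribute storage.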
    • IB/uverbs: Add a simple allocator to uverbs_attr_bundle · 461bb2ee
      Jason Gunthorpe committed
      This is similar in spirit to devm, it keeps track of any allocations
      linked to this method call and ensures they are all freed when the method
      exits. Further, if there is space in the internal/onstack buffer then the
      allocator will hand out that memory and avoid an expensive call to
      kmalloc/kfree in the syscall path.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
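The allocator idea can be sketched in user space as follows. This is only an illustration under stated assumptions: the internal buffer is 64 bytes, `malloc`/`free` stand in for the kernel allocator, and the names `bundle_alloc`, `bundle_init`, `bundle_alloc_mem` and `bundle_free_all` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define INLINE_WORDS 8u /* assumption: 64 bytes of internal buffer */

/* Header placed in front of each heap allocation so it can be tracked. */
struct heap_hdr {
	struct heap_hdr *next;
	uint64_t pad; /* keeps the payload aligned like a uint64_t */
};

struct bundle_alloc {
	uint64_t inline_buf[INLINE_WORDS]; /* handed out first, never freed */
	size_t used_words;
	struct heap_hdr *heap; /* heap allocations to release at method exit */
};

static void bundle_init(struct bundle_alloc *b)
{
	b->used_words = 0;
	b->heap = NULL;
}

/* Serve from the internal buffer when it fits, else fall back to malloc. */
static void *bundle_alloc_mem(struct bundle_alloc *b, size_t size)
{
	size_t words;
	struct heap_hdr *hdr;

	assert(b != NULL);
	if (size > SIZE_MAX / 2)
		return NULL; /* avoid overflow in the size arithmetic below */

	words = (size + sizeof(uint64_t) - 1) / sizeof(uint64_t);
	if (words <= INLINE_WORDS - b->used_words) {
		void *p = &b->inline_buf[b->used_words];

		b->used_words += words;
		return p;
	}

	hdr = malloc(sizeof(*hdr) + size);
	if (!hdr)
		return NULL;
	hdr->next = b->heap; /* link it so bundle_free_all() can find it */
	b->heap = hdr;
	return hdr + 1;
}

/* Called once when the method exits; returns the number of heap frees. */
static unsigned int bundle_free_all(struct bundle_alloc *b)
{
	unsigned int freed = 0;

	while (b->heap) {
		struct heap_hdr *next = b->heap->next;

		free(b->heap);
		b->heap = next;
		freed++;
	}
	b->used_words = 0;
	return freed;
}
```

Callers never free individual allocations, so error paths only need to reach the single cleanup call.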
  2. 11 Aug, 2018 3 commits
    • IB/uverbs: Remove the ib_uverbs_attr pointer from each attr · 6a1f444f
      Jason Gunthorpe committed
      Memory in the bundle is valuable, do not waste it holding an 8 byte
      pointer for the rare case of writing to a PTR_OUT. We can compute the
      pointer by storing a small 1 byte array offset and the base address of the
      uattr memory in the bundle private memory.
      
      This also means we can access the kernel's copy of the ib_uverbs_attr, so
      drop the copy of flags as well.
      
      Since the uattr base should be private bundle information this also
      de-inlines the already too big uverbs_copy_to inline and moves
      create_udata into uverbs_ioctl.c so they can see the private struct
      definition.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
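A small sketch of the offset-plus-base idea from this commit message, under the assumption that the user attributes sit in one contiguous array. The structures `user_attr`, `parsed_attr` and `bundle_priv` and the helper `attr_to_uattr` are hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one attribute header copied from user space. */
struct user_attr {
	uint16_t attr_id;
	uint16_t len;
	uint64_t data;
};

/* Per-attribute state: a 1-byte index instead of an 8-byte pointer. */
struct parsed_attr {
	uint8_t uattr_idx;
};

/* Private bundle state: the base address is stored once per call. */
struct bundle_priv {
	struct user_attr *uattr_base;
};

/* Recompute the pointer on demand from the base and the small index. */
static struct user_attr *attr_to_uattr(const struct bundle_priv *priv,
				       const struct parsed_attr *attr)
{
	assert(priv->uattr_base != NULL);
	return &priv->uattr_base[attr->uattr_idx];
}
```

The saving is per attribute, while the base pointer costs 8 bytes once per bundle.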
    • IB/uverbs: Provide implementation private memory for the uverbs_attr_bundle · 4b3dd2bb
      Jason Gunthorpe committed
      This already existed as the anonymous 'ctx' structure, but this was not
      really a useful form. Hoist this struct into bundle_priv and rework the
      internal things to use it instead.
      
      Move a bunch of the processing internal state into the priv and reduce the
      excessive use of function arguments.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
    • IB/uverbs: Build the specs into a radix tree at runtime · 9ed3e5f4
      Jason Gunthorpe committed
      This radix tree datastructure is intended to replace the 'hash' structure
      used today for parsing ioctl methods during system calls. This first
      commit introduces the structure and builds it from the existing .rodata
      descriptions.
      
      The so-called hash arrangement is actually a 5 level open coded radix tree.
      This new version uses a 3 level radix tree built using the radix tree
      library.
      
      Overall this is much less code and much easier to build as the radix tree
      API allows for dynamic modification during the building. There is a small
      memory penalty to pay for this, but since the radix tree is allocated on
      a per device basis, a few kb of RAM seems immaterial considering the
      gained simplicity.
      
      The radix tree is similar to the existing tree, but also has a 'attr_bkey'
      concept, which is a small-valued index for each method attribute. This is
      used to simplify and improve performance of everything in the next
      patches.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
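One way to picture an object/method/attribute lookup on top of a generic radix tree library is to pack the three ids into a single integer key. The sketch below shows only that key arithmetic. The bit widths, the "+1" slot offset and the function names are assumptions made for illustration and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths for this sketch only. */
#define ATTR_BITS   6u
#define METHOD_BITS 6u
#define ATTR_MASK   ((1u << ATTR_BITS) - 1)
#define METHOD_MASK ((1u << METHOD_BITS) - 1)

/* Level 1: the object id occupies the highest bits of the key. */
static uint32_t obj_key(uint32_t obj)
{
	assert(obj < (1u << (32 - ATTR_BITS - METHOD_BITS)));
	return obj << (ATTR_BITS + METHOD_BITS);
}

/* Level 2: a method key extends the key of the object it belongs to. */
static uint32_t method_key(uint32_t obj, uint32_t method)
{
	assert(method <= METHOD_MASK);
	return obj_key(obj) | (method << ATTR_BITS);
}

/*
 * Level 3: attribute slots start at 1 so that slot 0 stays free for the
 * method entry itself (an assumption of this sketch).
 */
static uint32_t attr_key(uint32_t obj, uint32_t method, uint32_t attr)
{
	assert(attr + 1 <= ATTR_MASK);
	return method_key(obj, method) | (attr + 1);
}
```

Because a method key is a prefix of the keys of its attributes, a lookup that has already found the method only needs to vary the low bits to find each attribute.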
  3. 02 Aug, 2018 1 commit
    • IB/uverbs: Do not pass struct ib_device to the ioctl methods · e83f0ecd
      Jason Gunthorpe committed
      This does the same as the patch before, except for ioctl. The rules are
      the same, but for the ioctl methods the core code handles setting up the
      uobject.
      
      - Retrieve the ib_dev from the uobject->context->device. This is
        safe under ioctl as the core has already done rdma_alloc_begin_uobject
        and so CREATE calls are entirely protected by the rwsem.
      - Retrieve the ib_dev from uobject->object
      - Call ib_uverbs_get_ucontext()
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  4. 31 Jul, 2018 1 commit
    • IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs language · bccd0622
      Jason Gunthorpe committed
      This clearly indicates that the input is a bitwise combination of values
      in an enum, and identifies which enum contains the definition of the bits.
      
      Special accessors are provided that handle the mandatory validation of the
      allowed bits and enforce the correct type for bitwise flags.
      
      If we had introduced this at the start then the kabi would have uniformly
      used u64 data to pass flags, however today there is a mixture of u64 and
      u32 flags. All places are converted to accept both sizes and the accessor
      fixes it. This allows all existing flags to grow to u64 in future without
      any hassle.
      
      Finally all flags are, by definition, optional. If flags are not passed
      the accessor does not fail, but provides a value of zero.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
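The accessor behaviour described in this commit message (accept u32 or u64 input, reject bits outside the allowed set, treat a missing attribute as zero) can be sketched like this. The `flags_attr` structure and the `get_flags64` name are hypothetical, and the choice of -EINVAL for every failure is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parsed attribute: raw bytes as copied from user space. */
struct flags_attr {
	int present;           /* 0 when user space did not pass the attribute */
	size_t len;            /* expected to be 4 or 8 */
	unsigned char data[8];
};

/* Widen u32 or u64 input to u64 and validate it against allowed_bits. */
static int get_flags64(uint64_t *to, const struct flags_attr *attr,
		       uint64_t allowed_bits)
{
	uint64_t flags;

	assert(to != NULL && attr != NULL);

	if (!attr->present) {
		*to = 0; /* flags are optional: absent means no flags set */
		return 0;
	}

	if (attr->len == sizeof(uint32_t)) {
		uint32_t v32;

		memcpy(&v32, attr->data, sizeof(v32));
		flags = v32;
	} else if (attr->len == sizeof(uint64_t)) {
		memcpy(&flags, attr->data, sizeof(flags));
	} else {
		return -EINVAL; /* neither of the two accepted sizes */
	}

	if (flags & ~allowed_bits)
		return -EINVAL; /* a bit outside the enum was set */

	*to = flags;
	return 0;
}
```

Callers always receive a 64-bit value, so a flag field that is u32 today can grow later without changing them.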
  5. 26 Jul, 2018 1 commit
    • IB/uverbs: Fix locking around struct ib_uverbs_file ucontext · 22fa27fb
      Jason Gunthorpe committed
      We have a parallel unlocked reader and writer with ib_uverbs_get_context()
      vs everything else, and nothing guarantees this works properly.
      
      Audit and fix all of the places that access ucontext to use one of the
      following locking schemes:
      - Call ib_uverbs_get_ucontext() under SRCU and check for failure
      - Access the ucontext through a struct ib_uobject context member
        while holding a READ or WRITE lock on the uobject.
        This value cannot be NULL and has no race.
      - Hold the ucontext_lock and check for ufile->ucontext != NULL
      
      This also re-implements ib_uverbs_get_ucontext() in a way that is safe
      against concurrent ib_uverbs_get_context() and disassociation.
      
      As a side effect, every access to ucontext in the commands is via
      ib_uverbs_get_ucontext() with an error check, or via the uobject, so there
      is no longer any need for the core code to check ucontext on every command
      call. These checks are also removed.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  6. 25 Jul, 2018 1 commit
  7. 05 Jul, 2018 7 commits
  8. 22 Jun, 2018 1 commit
  9. 20 Jun, 2018 2 commits
  10. 02 Jun, 2018 1 commit
  11. 24 May, 2018 1 commit
  12. 06 Apr, 2018 1 commit
  13. 05 Apr, 2018 1 commit
    • IB/uverbs: Add enum attribute type to ioctl() interface · 494c5580
      Matan Barak committed
      Methods sometimes need to get one attribute out of a group of
      pre-defined attributes. This is an enum-like behavior. Since
      this is a common requirement, we add a new ENUM attribute to the
      generic uverbs ioctl() layer. This attribute is embedded in methods,
      like any other attributes we currently have. ENUM attributes point to
      an array of standard UVERBS_ATTR_PTR_IN. The user-space encodes the
      enum's attribute id in the id field and the internal PTR_IN attr id in
      the enum_data.elem_id field. This ENUM attribute could be shared by
      several attributes and it can get UVERBS_ATTR_SPEC_F_MANDATORY flag,
      stating this attribute must be supported by the kernel, like any other
      attribute.
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  14. 20 Mar, 2018 4 commits
  15. 16 Feb, 2018 2 commits
  16. 31 Aug, 2017 7 commits
    • IB/core: Assign root to all drivers · 52427112
      Matan Barak committed
      In order to use the parsing tree, we need to assign the root
      to all drivers. Currently, we just assign the default parsing
      tree via ib_uverbs_add_one. The driver could override this by
      assigning a parsing tree prior to registering the device.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add legacy driver's user-data · d70724f1
      Matan Barak committed
      In this phase, we don't want to change all the drivers to use
      flexible driver's specific attributes. Therefore, we add two default
      attributes: UHW_IN and UHW_OUT. These attributes are optional in some
      methods and they encode the driver specific command data. We add
      a function that extracts this data and creates the legacy udata over
      it.
      
      Driver's data should start from UVERBS_UDATA_DRIVER_DATA_FLAG. This
      turns on the first bit of the namespace, indicating this attribute
      belongs to the driver's namespace.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add macros for declaring methods and attributes · 35410306
      Matan Barak committed
      This patch adds macros for declaring objects, methods and
      attributes. These definitions are later used by downstream patches
      to declare some of the default types.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add uverbs merge trees functionality · 118620d3
      Matan Barak committed
      Different drivers support different features and even subsets of the
      common uverbs implementation. Currently, this is handled as bitmask
      in every driver that represents which kind of methods it supports, but
      doesn't go down to attributes granularity. Moreover, drivers might
      want to add their specific types, methods and attributes to let
      their user-space counter-parts be exposed to some more efficient
      abstractions. It means that existence of different features is
      validated syntactically via the parsing infrastructure rather than
      using a complex in-handler logic.
      
      In order to do that, we allow defining features and abstractions
      as parsing trees. These per-feature parsing trees could be merged
      to an efficient (perfect-hash based) parsing tree, which is later
      used by the parsing infrastructure.
      
      To sum it up, this makes a parse tree unique for a device and
      represents only the features this particular device supports.
      This is done by having a root specification tree per feature.
      Before a device registers itself as an IB device, it merges
      all these trees into one parsing tree. This parsing tree
      is used to parse all user-space commands.
      
      A future user-space application could read this parse tree. This
      tree represents which objects, methods and attributes are
      supported by this device.
      
      This is based on the idea of
      Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add DEVICE object and root tree structure · 09e3ebf8
      Matan Barak committed
      This adds the DEVICE object. This object supports creating the context
      that all objects are created from. Moreover, it supports executing
      methods which are related to the device itself, such as QUERY_DEVICE.
      This is a singleton object (per file instance).
      
      All standard objects are put in the root structure. This root will later
      on be used in drivers as the source for their whole parsing tree.
      Later on, when new features are added, these drivers could mix this root
      with other customized objects.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Declare an object instead of declaring only type attributes · 5009010f
      Matan Barak committed
      Switch all uverbs_type_attrs_xxxx with DECLARE_UVERBS_OBJECT
      macros. This will be later used in order to embed the object
      specific methods in the objects as well.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add new ioctl interface · fac9658c
      Matan Barak committed
      In this ioctl interface, processing the command starts from
      properties of the command and fetching the appropriate user objects
      before calling the handler.
      
      Parsing and validation is done according to a specifier declared by
      the driver's code. In the driver, all supported objects are declared.
      These objects are separated into different object namespaces. Dividing
      objects into namespaces is done at initialization by using the higher
      bits of the object ids. This initialization can mix objects declared
      in different places into one parsing tree used in this ioctl interface.
      
      For each object we list all supported methods. Similarly to objects,
      methods are separated to method namespaces too. Namespacing is done
      similarly to the objects case. This could be used in order to add
      methods to an existing object.
      
      Each method has a specific handler, which could be either a default
      handler or a driver specific handler.
      Along with the handler, a bunch of attributes are specified as well.
      Similarly to objects and methods, attributes are namespaced and hashed
      by their ids at initialization too. All supported attributes are
      subject to automatic fetching and validation. These attributes include
      the command, response and the method's related objects' ids.
      
      When these entities (objects, methods and attributes) are used, the
      high bits of the entities ids are used in order to calculate the hash
      bucket index. Then, these high bits are masked out in order to have a
      zero based index. Since we use these high bits for both bucketing and
      namespacing, we get a compact representation and O(1) array access.
      This is mandatory for efficient dispatching.
      
      Each attribute has a type (PTR_IN, PTR_OUT, IDR and FD) and a length.
      Attributes could be validated through some attributes, like:
      (*) Minimum size / Exact size
      (*) Fops for FD
      (*) Object type for IDR
      
      If an IDR/fd attribute is specified, the kernel also states the object
      type and the required access (NEW, WRITE, READ or DESTROY).
      All uobject/fd management is done automatically by the infrastructure,
      meaning the infrastructure will fail concurrent commands when at
      least one of them requires exclusive access (WRITE/DESTROY),
      synchronize actions with device removals (dissociate context events)
      and take care of reference counting (increase/decrease) for concurrent
      actions invocation. The reference counts on the actual kernel objects
      shall be handled by the handlers.
      
       objects
      +--------+
      |        |
      |        |   methods                                                                +--------+
      |        |   ns         method      method_spec                           +-----+   |len     |
      +--------+  +------+[d]+-------+   +----------------+[d]+------------+    |attr1+-> |type    |
      | object +> |method+-> | spec  +-> +  attr_buckets  +-> |default_chain+--> +-----+   |idr_type|
      +--------+  +------+   |handler|   |                |   +------------+    |attr2|   |access  |
      |        |  |      |   +-------+   +----------------+   |driver chain|    +-----+   +--------+
      |        |  |      |                                    +------------+
      |        |  +------+
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      +--------+
      
      [d] = Hash ids to groups using the high order bits
      
      The right types table is also chosen by using the high bits from
      the ids. Currently we have either default or driver specific groups.
      
      Once validation and object fetching (or creation) are completed, we call
      the handler:
      int (*handler)(struct ib_device *ib_dev, struct ib_uverbs_file *ufile,
                     struct uverbs_attr_bundle *ctx);
      
      ctx bundles attributes of different namespaces. Each element there
      is an array of attributes which corresponds to one namespace of
      attributes. For example, in the usually used case:
      
       ctx                               core
      +----------------------------+     +------------+
      | core:                      +---> | valid      |
      +----------------------------+     | cmd_attr   |
      | driver:                    |     +------------+
      |----------------------------+--+  | valid      |
                                      |  | cmd_attr   |
                                      |  +------------+
                                      |  | valid      |
                                      |  | obj_attr   |
                                      |  +------------+
                                      |
                                      |  drivers
                                      |  +------------+
                                      +> | valid      |
                                         | cmd_attr   |
                                         +------------+
                                         | valid      |
                                         | cmd_attr   |
                                         +------------+
                                         | valid      |
                                         | obj_attr   |
                                         +------------+
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
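The bucketing scheme described in this commit message (high bits select the namespace bucket, the remaining bits are a zero-based index) can be sketched as below. It assumes a 16-bit id whose top four bits carry the namespace. The constants `ID_NS_SHIFT` and `ID_NS_MASK`, the structures and `find_attr` are illustrative names, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the namespace lives in the top 4 bits of a 16-bit id. */
#define ID_NS_SHIFT 12u
#define ID_NS_MASK  0xF000u

struct attr_spec {
	const char *name;
};

/* One bucket per namespace, e.g. bucket 0 = default, bucket 1 = driver. */
struct attr_bucket {
	size_t num_attrs;
	const struct attr_spec *attrs;
};

struct method_spec {
	size_t num_buckets;
	const struct attr_bucket *buckets;
};

/* Two array indexings, no searching: O(1) per attribute id. */
static const struct attr_spec *find_attr(const struct method_spec *m,
					 uint16_t id)
{
	size_t ns = (id & ID_NS_MASK) >> ID_NS_SHIFT; /* bucket index */
	size_t idx = id & ~ID_NS_MASK;                /* index in bucket */

	assert(m != NULL);
	if (ns >= m->num_buckets)
		return NULL; /* unknown namespace */
	if (idx >= m->buckets[ns].num_attrs)
		return NULL; /* id not declared in this namespace */
	return &m->buckets[ns].attrs[idx];
}
```

An id that is out of range in either dimension is rejected before any handler runs, which is the "syntactic" validation the message refers to.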
  17. 30 Aug, 2017 2 commits
    • IB/core: Add support to finalize objects in one transaction · f43dbebf
      Matan Barak committed
      The new ioctl based infrastructure either commits or rolls back
      all objects of the method as one transaction. In order to do
      that, we introduce a notion of dealing with a collection of
      objects that are related to a specific method.
      
      This also requires adding a notion of a method and attribute.
      A method contains a hash of attributes, where each bucket
      contains several attributes. The attributes are hashed according
      to their namespace which resides in the four upper bits of the id.
      
      For example, an object could be a CQ, which has an action of CREATE_CQ.
      This action has multiple attributes. For example, the CQ's new handle
      and the comp_channel. Each layer in this hierarchy - objects, methods
      and attributes is split into namespaces. The basic example for that is
      one namespace representing the default entities and another one
      representing the driver specific entities.
      
      When declaring these methods and attributes, we actually declare
      their specifications. When a method is executed, we
      allocate some space to hold auxiliary information. This auxiliary
      information contains meta-data about the required objects, such
      as pointers to their type information, pointers to the uobjects
      themselves (if exist), etc.
      The specification, along with the auxiliary information we allocated
      and filled is given to the finalize_objects function.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add a generic way to execute an operation on a uobject · a0aa309c
      Matan Barak committed
      The ioctl infrastructure treats all user-objects in the same manner.
      It gets object ids from user space and, by using the object type
      and type attributes mentioned in the object specification, it executes
      this required method. Passing an object id from the user-space as
      an attribute is carried out in three stages. The first is carried out
      before the actual handler and the last is carried out afterwards.
      
      The different supported operations are read, write, destroy and create.
      In the first stage, the first three actions just fetch the object
      from the repository (by using its id) and locks it. The last action
      allocates a new uobject. Afterwards, the second stage is carried out
      when the handler itself carries out the required modification of the
      object. The last stage is carried out after the handler finishes and
      commits the result. The first two operations just unlock the object.
      Destroy calls the "free object" operation, taking into account the
      object's type and releases the uobject as well. Creation just adds the
      new uobject to the repository, making the object visible to the
      application.
      
      In order to abstract these details from the ioctl infrastructure
      layer, we add uverbs_get_uobject_from_context and
      uverbs_finalize_object functions which correspond to the first
      and last stages respectively.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
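The three stages described in this commit message can be modelled with a toy object in user space. This is a sketch only: the lock is represented by plain counters rather than real locking, and `uobject`, `uobj_get` and `uobj_finalize` are hypothetical names that merely mirror the roles of the first and last stages.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum obj_access { ACCESS_READ, ACCESS_WRITE, ACCESS_DESTROY, ACCESS_NEW };

/* Hypothetical user object; locking is modelled with plain fields. */
struct uobject {
	int readers;    /* number of shared (READ) holders */
	bool exclusive; /* held for WRITE, DESTROY or NEW */
	bool visible;   /* reachable by id from the repository */
	bool destroyed;
};

/* Stage 1: fetch and lock an existing object, or prepare a new one. */
static int uobj_get(struct uobject *obj, enum obj_access access)
{
	assert(obj != NULL);

	if (access == ACCESS_NEW) {
		obj->readers = 0;
		obj->exclusive = true;
		obj->visible = false; /* hidden until committed */
		obj->destroyed = false;
		return 0;
	}
	if (!obj->visible)
		return -ENOENT; /* not in the repository */
	if (access == ACCESS_READ) {
		if (obj->exclusive)
			return -EBUSY;
		obj->readers++;
		return 0;
	}
	/* WRITE and DESTROY need the object to themselves. */
	if (obj->exclusive || obj->readers)
		return -EBUSY;
	obj->exclusive = true;
	return 0;
}

/* Stage 3: runs after the handler (stage 2); commit tells how it went. */
static void uobj_finalize(struct uobject *obj, enum obj_access access,
			  bool commit)
{
	switch (access) {
	case ACCESS_READ:
		obj->readers--; /* just unlock */
		break;
	case ACCESS_WRITE:
		obj->exclusive = false; /* just unlock */
		break;
	case ACCESS_DESTROY:
		if (commit) {
			obj->destroyed = true;
			obj->visible = false;
		}
		obj->exclusive = false;
		break;
	case ACCESS_NEW:
		if (commit)
			obj->visible = true; /* publish to the application */
		else
			obj->destroyed = true; /* discard the allocation */
		obj->exclusive = false;
		break;
	}
}
```

A handler in this model only ever sees objects that stage 1 has already locked or allocated, so it contains no locking or lookup code of its own.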
  18. 03 Apr, 2015 1 commit