1. 08 Aug, 2018 1 commit
    • tipc: fix an interrupt unsafe locking scenario · 37436d9c
      Ying Xue committed
      Commit 9faa89d4 ("tipc: make function tipc_net_finalize() thread
      safe") tries to make it thread safe to set node address, so it uses
      node_list_lock lock to serialize the whole process of setting node
      address in tipc_net_finalize(). But it causes the following interrupt
      unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        rht_deferred_worker()
        rhashtable_rehash_table()
        lock(&(&ht->lock)->rlock)
                                     tipc_nl_compat_doit()
                                     tipc_net_finalize()
                                     local_irq_disable();
                                     lock(&(&tn->node_list_lock)->rlock);
                                     tipc_sk_reinit()
                                     rhashtable_walk_enter()
                                     lock(&(&ht->lock)->rlock);
        <Interrupt>
        tipc_disc_rcv()
        tipc_node_check_dest()
        tipc_node_create()
        lock(&(&tn->node_list_lock)->rlock);
      
       *** DEADLOCK ***
      
      When rhashtable_rehash_table() holds ht->lock on CPU0, it doesn't
      disable BH. So if an interrupt happens after the lock, it can create
      an inverse lock ordering between ht->lock and tn->node_list_lock. As
      a consequence, deadlock might happen.
      
      The inverse lock ordering above arises because node_list_lock was
      never designed to serialize the setting of the node address.
      
      As cmpxchg() guarantees that the CAS (compare-and-swap) operation is
      atomic, we use it in place of node_list_lock to ensure that setting the
      node address completes atomically. This avoids the potential deadlock
      as well.
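
A minimal user-space sketch of the idea, using C11 atomics as a stand-in for the kernel's cmpxchg(). The names demo_net and demo_net_finalize are hypothetical and only illustrate how a compare-and-swap lets exactly one caller set the address without holding a lock:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-namespace state; 0 means "address not set". */
struct demo_net {
	_Atomic uint32_t node_addr;
};

/*
 * Try to claim the node address. Returns true only for the single caller
 * whose compare-and-swap replaced 0 with addr. Later callers observe a
 * non-zero value and back off, so no lock is needed.
 */
static bool demo_net_finalize(struct demo_net *dn, uint32_t addr)
{
	uint32_t expected = 0;

	assert(addr != 0); /* 0 is reserved to mean "unset" in this sketch */
	return atomic_compare_exchange_strong(&dn->node_addr, &expected, addr);
}
```

Because no lock is taken here, there is no lock ordering to invert when an interrupt arrives in the middle of the operation.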
      
      Fixes: 9faa89d4 ("tipc: make function tipc_net_finalize() thread safe")
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <maloy@donjonn.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 07 Jul, 2018 1 commit
  3. 17 Apr, 2018 1 commit
  4. 01 Apr, 2018 1 commit
    • tipc: permit overlapping service ranges in name table · 37922ea4
      Jon Maloy committed
      With the new RB tree structure for service ranges it becomes possible to
      solve an old problem: we can now allow overlapping service ranges in
      the table.
      
      When inserting a new service range into the tree, we use 'lower' as the
      primary key and, when necessary, 'upper' as the secondary key.
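
The two-level key ordering can be sketched as a comparison function. The struct demo_range and demo_range_cmp below are hypothetical simplifications for illustration, not the kernel's actual structures:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified service range holding only the two key fields. */
struct demo_range {
	uint32_t lower;
	uint32_t upper;
};

/*
 * Order by 'lower' first and fall back to 'upper' only when 'lower' ties.
 * Returns a negative value, zero, or a positive value.
 */
static int demo_range_cmp(const struct demo_range *a,
			  const struct demo_range *b)
{
	assert(a->lower <= a->upper && b->lower <= b->upper);
	if (a->lower != b->lower)
		return a->lower < b->lower ? -1 : 1;
	if (a->upper != b->upper)
		return a->upper < b->upper ? -1 : 1;
	return 0;
}
```

With this ordering, overlapping ranges such as 10..20 and 15..18 are distinct keys, and two ranges that share 'lower' are still told apart by 'upper'.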
      
      Since there may now be multiple service ranges matching an indicated
      'lower' value, we must also add the 'upper' value to the functions
      used for removing publications, so that the correct, corresponding
      range item can be found.
      
      These changes guarantee that a well-formed publication/withdrawal item
      from a peer node never will be rejected, and make it possible to
      eliminate the problematic backlog functionality we currently have for
      handling such cases.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 24 Mar, 2018 5 commits
    • tipc: handle collisions of 32-bit node address hash values · 25b0b9c4
      Jon Maloy committed
      When a 32-bit node address is generated from a 128-bit identifier,
      there is a risk of collisions which must be discovered and handled.
      
      We do this as follows:
      - We don't apply the generated address immediately to the node, but do
        instead initiate a 1 sec trial period to allow other cluster members
        to discover and handle such collisions.
      
      - During the trial period the node periodically sends out a new type
        of message, DSC_TRIAL_MSG, using broadcast or emulated broadcast,
        to all the other nodes in the cluster.
      
      - When a node receives such a message, it must check that the
        presented 32-bit identifier either is unused, or was used by the very
        same peer in a previous session. In both cases it accepts the request
        by not responding to it.
      
      - If it finds that the same node has been up before using a different
        address, it responds with a DSC_TRIAL_FAIL_MSG containing that
        address.
      
      - If it finds that the address has already been taken by some other
        node, it generates a new, unused address and returns it to the
        requester.
      
      - During the trial period the requesting node must always be prepared
        to accept a failure message, i.e., a message where a peer suggests a
        different (or equal) address to the one tried. In those cases it
        must apply the suggested value as trial address and restart the trial
        period.
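
The receiver-side checks in the list above can be sketched as follows. This is an illustration under stated assumptions: demo_peer, demo_check_trial, and the "probe upward" policy for picking a free address are all hypothetical and do not reflect the kernel's data structures or its address generation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ID_LEN 16 /* 128-bit node identity */

/* Hypothetical record of a peer this node has seen before. */
struct demo_peer {
	uint8_t id[DEMO_ID_LEN];
	uint32_t addr;
};

static bool demo_addr_in_use(const struct demo_peer *tbl, size_t n,
			     uint32_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].addr == addr)
			return true;
	return false;
}

/*
 * Returns true when the trial address is acceptable (the receiver stays
 * silent). Otherwise returns false and stores in *suggest the address
 * that a failure message would carry.
 */
static bool demo_check_trial(const struct demo_peer *tbl, size_t n,
			     const uint8_t id[DEMO_ID_LEN], uint32_t addr,
			     uint32_t *suggest)
{
	assert(suggest != NULL);

	/* Same identity seen before: it should keep its earlier address. */
	for (size_t i = 0; i < n; i++) {
		if (memcmp(tbl[i].id, id, DEMO_ID_LEN) == 0) {
			if (tbl[i].addr == addr)
				return true;
			*suggest = tbl[i].addr;
			return false;
		}
	}

	/* Unknown identity: accept unless another node owns the address. */
	if (!demo_addr_in_use(tbl, n, addr))
		return true;

	/* Placeholder policy: probe upward until a free address is found. */
	do {
		addr++;
	} while (addr == 0 || demo_addr_in_use(tbl, n, addr));
	*suggest = addr;
	return false;
}
```

On the requesting side, a false result corresponds to receiving a failure message: the node would adopt the suggested value as its trial address and restart the trial period.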
      
      This algorithm ensures that in the vast majority of cases a node will
      have the same address before and after a reboot. If a legacy user
      configures the address explicitly, there will be no trial period and
      messages, so this protocol addition is completely backwards compatible.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: add 128-bit node identifier · d50ccc2d
      Jon Maloy committed
      We add a 128-bit node identity, as an alternative to the currently used
      32-bit node address.
      
      For the sake of compatibility and to minimize message header changes
      we retain the existing 32-bit address field. When not set explicitly by
      the user, this field will be filled with a hash value generated from the
      much longer node identity, and be used as a shorthand value for the
      latter.
      
      We permit either the address or the identity to be set by configuration,
      but not both, so when the address value is set by a legacy user the
      corresponding 128-bit node identity is generated based on that value.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: remove direct accesses to own_addr field in struct tipc_net · 23fd3eac
      Jon Maloy committed
      As a preparation to changing the addressing structure of TIPC we replace
      all direct accesses to the tipc_net::own_addr field with the function
      dedicated for this, tipc_own_addr().
      
      There are no changes to program logic in this commit.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: allow closest-first lookup algorithm when legacy address is configured · b89afb11
      Jon Maloy committed
      The removal of an internal structure of the node address has an unwanted
      side effect.
      - Currently, if a user sends an anycast message with destination
        domain 0, the tipc_nametbl_translate() function uses the
        'closest-first' algorithm to look for a node-local destination first,
        and only when none is found does it resort to the cluster-global
        'round-robin' lookup algorithm.
      - Current users can get around this, and enforce unconditional use of
        global round-robin by indicating a destination as Z.0.0 or Z.C.0.
      - This option disappears when we make the node address flat, since the
        lookup algorithm has no way of recognizing this case. So, as long as
        there are node local destinations, the algorithm will always select
        one of those, and there is nothing the sender can do to change this.
      
      We solve this by eliminating the 'closest-first' option, which was never
      a good idea anyway, but only for non-legacy users. To distinguish
      between legacy and non-legacy users we introduce a new flag,
      'legacy_addr_format', in struct tipc_core, which is set when the user
      configures a legacy-style Z.C.N node address. Hence, when a legacy user
      indicates a zero lookup domain, 'closest-first' is selected; in all
      other cases we use 'round-robin'.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: remove restrictions on node address values · 20263641
      Jon Maloy committed
      Nominally, TIPC organizes network nodes into a three-level network
      hierarchy consisting of the levels 'zone', 'cluster' and 'node'. This
      hierarchy is reflected in the node address format: it is subdivided
      into an 8-bit zone id, a 12-bit cluster id, and a 12-bit node id.
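
The legacy 8/12/12-bit split can be sketched with a few shift-and-mask helpers. The demo_* names are hypothetical and exist only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Legacy layout described above: 8-bit zone | 12-bit cluster | 12-bit node. */
static uint32_t demo_zone(uint32_t addr)    { return addr >> 24; }
static uint32_t demo_cluster(uint32_t addr) { return (addr >> 12) & 0xfff; }
static uint32_t demo_node(uint32_t addr)    { return addr & 0xfff; }

/* Pack a Z.C.N triple into a 32-bit address. */
static uint32_t demo_addr(uint32_t z, uint32_t c, uint32_t n)
{
	assert(z <= 0xff && c <= 0xfff && n <= 0xfff);
	return (z << 24) | (c << 12) | n;
}
```

For example, the legacy address 1.1.10 packs to 0x0100100a. Only the low 12 bits distinguish nodes within a cluster, which is the limitation this commit removes.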
      
      However, the 'zone' and 'cluster' levels have in reality never been
      fully implemented, and never will be. As a result, the first 20 bits
      of the node identity structure have been wasted, and the usable node
      identity range within a cluster has been limited to 12 bits. This is
      starting to become a problem.
      
      In the following commits, we will need to be able to connect between
      nodes which are using the whole 32-bit value space of the node address.
      We therefore remove the restrictions on which values can be assigned
      to the node identity: from now on it is only a 32-bit integer with no
      assumed internal structure.
      
      Isolation between clusters is now achieved only by setting different
      values for the 'network id' field used during neighbor discovery, in
      practice leading to the latter becoming the new cluster identity.
      
      The rules for accepting discovery requests/responses from neighboring
      nodes now become:
      
      - If the user is using legacy address format on both peers, reception
        of discovery messages is subject to the legacy lookup domain check
        in addition to the cluster id check.
      
      - Otherwise, the discovery request/response is always accepted, provided
        both peers have the same network id.
      
      This secures backwards compatibility for users who have been using zone
      or cluster identities as cluster separators, instead of the intended
      'network id'.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 18 Mar, 2018 1 commit
    • tipc: obsolete TIPC_ZONE_SCOPE · 928df188
      Jon Maloy committed
      Publications for TIPC_CLUSTER_SCOPE and TIPC_ZONE_SCOPE are in all
      aspects handled the same way, both on the publishing node and on the
      receiving nodes.
      
      Despite previous ambitions to the contrary, this is never going to change,
      so we take the consequence of this and obsolete TIPC_ZONE_SCOPE and related
      macros/functions. Whenever a user is doing a bind() or a sendmsg() attempt
      using ZONE_SCOPE we translate this internally to CLUSTER_SCOPE, while we
      remain compatible with users and remote nodes still using ZONE_SCOPE.
      
      Furthermore, the non-formalized scope value 0 has always been permitted
      for use during lookup, with the same meaning as ZONE_SCOPE/CLUSTER_SCOPE.
      We now permit it even as binding scope, but for compatibility reasons we
      choose to not change the value of TIPC_CLUSTER_SCOPE.
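
The scope translation described above can be sketched like this. The enum and its numeric values are assumptions chosen for illustration, and demo_internal_scope is a hypothetical helper rather than a kernel function:

```c
#include <assert.h>

/* Demo constants; the numeric values are assumptions for illustration. */
enum demo_scope {
	DEMO_SCOPE_UNSPEC  = 0, /* the non-formalized value 0 */
	DEMO_ZONE_SCOPE    = 1, /* obsolete, still accepted from users */
	DEMO_CLUSTER_SCOPE = 2,
	DEMO_NODE_SCOPE    = 3,
};

/* Map the scope a user passed in to the scope used internally. */
static enum demo_scope demo_internal_scope(int user_scope)
{
	assert(user_scope >= DEMO_SCOPE_UNSPEC &&
	       user_scope <= DEMO_NODE_SCOPE);
	if (user_scope == DEMO_NODE_SCOPE)
		return DEMO_NODE_SCOPE;
	/* 0, ZONE_SCOPE and CLUSTER_SCOPE all mean cluster scope. */
	return DEMO_CLUSTER_SCOPE;
}
```

Keeping the user-visible constants unchanged and translating at the boundary is what lets older users and remote nodes continue to pass ZONE_SCOPE.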
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 15 Feb, 2018 1 commit
  8. 14 Apr, 2017 2 commits
    • netlink: pass extended ACK struct where available · fe52145f
      Johannes Berg committed
      This is an add-on to the previous patch that passes the extended ACK
      structure where it's already available by existing genl_info or extack
      function arguments.
      
      This was done with this spatch (with some manual adjustment of
      indentation):
      
      @@
      expression A, B, C, D, E;
      identifier fn, info;
      @@
      fn(..., struct genl_info *info, ...) {
      ...
      -nlmsg_parse(A, B, C, D, E, NULL)
      +nlmsg_parse(A, B, C, D, E, info->extack)
      ...
      }
      
      @@
      expression A, B, C, D, E;
      identifier fn, info;
      @@
      fn(..., struct genl_info *info, ...) {
      <...
      -nla_parse_nested(A, B, C, D, NULL)
      +nla_parse_nested(A, B, C, D, info->extack)
      ...>
      }
      
      @@
      expression A, B, C, D, E;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nlmsg_parse(A, B, C, D, E, NULL)
      +nlmsg_parse(A, B, C, D, E, extack)
      ...>
      }
      
      @@
      expression A, B, C, D, E;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nla_parse(A, B, C, D, E, NULL)
      +nla_parse(A, B, C, D, E, extack)
      ...>
      }
      
      @@
      expression A, B, C, D, E;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      ...
      -nlmsg_parse(A, B, C, D, E, NULL)
      +nlmsg_parse(A, B, C, D, E, extack)
      ...
      }
      
      @@
      expression A, B, C, D;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nla_parse_nested(A, B, C, D, NULL)
      +nla_parse_nested(A, B, C, D, extack)
      ...>
      }
      
      @@
      expression A, B, C, D;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nlmsg_validate(A, B, C, D, NULL)
      +nlmsg_validate(A, B, C, D, extack)
      ...>
      }
      
      @@
      expression A, B, C, D;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nla_validate(A, B, C, D, NULL)
      +nla_validate(A, B, C, D, extack)
      ...>
      }
      
      @@
      expression A, B, C;
      identifier fn, extack;
      @@
      fn(..., struct netlink_ext_ack *extack, ...) {
      <...
      -nla_validate_nested(A, B, C, NULL)
      +nla_validate_nested(A, B, C, extack)
      ...>
      }
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: pass extended ACK struct to parsing functions · fceb6435
      Johannes Berg committed
      Pass the new extended ACK reporting struct to all of the generic
      netlink parsing functions. For now, pass NULL in almost all callers
      (except for some in the core.)
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 18 Feb, 2017 1 commit
  10. 16 Feb, 2017 1 commit
  11. 14 Feb, 2017 1 commit
  12. 08 Mar, 2016 1 commit
  13. 07 Mar, 2016 1 commit
  14. 24 Oct, 2015 2 commits
    • tipc: create broadcast transmission link at namespace init · 5fd9fd63
      Jon Paul Maloy committed
      The broadcast transmission link is currently instantiated when the
      network subsystem is started, i.e., on order from user space via netlink.
      
      This forces the broadcast transmission code to do unnecessary tests for
      the existence of the transmission link, both in single-node mode and
      in network mode.
      
      In this commit, we instead create the link during initialization of
      the name space, and remove it when the name space is stopped. The fact
      that the transmission link now has a guaranteed longer life cycle than
      any of its potential clients paves the way for further code
      simplifications and optimizations.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: move bcast definitions to bcast.c · 6beb19a6
      Jon Paul Maloy committed
      Currently, a number of structure and function definitions related
      to the broadcast functionality are unnecessarily exposed in the file
      bcast.h. This obscures the fact that the external interface towards
      the broadcast link in fact is very narrow, and causes unnecessary
      recompilations of other files when anything changes in those
      definitions.
      
      In this commit, we move as many of those definitions as is currently
      possible to the file bcast.c.
      
      We also rename the structure 'tipc_bclink' to 'tipc_bc_base', both
      since the name does not correctly describe the contents of this
      struct, and will do so even less in the future, and because we want
      to use the term 'link' more appropriately in the functionality
      introduced later in this series.
      
      Finally, we rename a couple of functions, such as tipc_bclink_xmit()
      and others that will be kept in the future, to include the term 'bcast'
      instead.
      
      There are no functional changes in this commit.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 15 May, 2015 1 commit
    • tipc: simplify include dependencies · a6bf70f7
      Jon Paul Maloy committed
      When we try to add new inline functions in the code, we sometimes
      run into circular include dependencies.
      
      The main problem is that the file core.h, which really should be at
      the root of the dependency chain, instead is a leaf. I.e., core.h
      includes a number of header files that themselves should be allowed
      to include core.h. In reality this is unnecessary, because core.h does
      not need to know the full signature of any of the structs it refers to,
      only their type declaration.
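
The point about needing only a type declaration can be illustrated with forward declarations. The demo_* names below are hypothetical. A struct that stores only pointers to other struct types compiles without including the headers that define them:

```c
#include <assert.h>
#include <stddef.h>

/* Forward declarations: enough to declare pointers, no header needed. */
struct demo_node;
struct demo_bearer;

/* Hypothetical "core" struct that only stores pointers to the types above. */
struct demo_core {
	struct demo_node *node;
	struct demo_bearer *bearer;
};

/* Works on the pointers alone, so the full struct layouts stay unknown. */
static int demo_core_is_empty(const struct demo_core *core)
{
	assert(core != NULL);
	return core->node == NULL && core->bearer == NULL;
}
```

Only code that dereferences the pointers needs the full definitions, so the headers providing them are free to include the core header themselves.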
      
      In this commit, we remove all dependencies from core.h towards any
      other tipc header file.
      
      As a consequence of this change, we can now move the function
      tipc_own_addr(net) from addr.c to addr.h, and make it inline.
      
      There are no functional changes in this commit.
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 10 Feb, 2015 3 commits
  17. 13 Jan, 2015 7 commits
  18. 22 Nov, 2014 2 commits
  19. 24 Aug, 2014 2 commits
  20. 28 Jun, 2014 2 commits
  21. 15 May, 2014 1 commit
  22. 06 May, 2014 1 commit
  23. 23 Apr, 2014 1 commit
    • tipc: purge tipc_net_lock lock · 7216cd94
      Ying Xue committed
      The TIPC routing hierarchy currently comprises the structures 'node',
      'link' and 'bearer'. The whole hierarchy is protected by a big
      read/write lock, tipc_net_lock, to ensure that nothing is added or
      removed while code is accessing any of these structures. This locking
      policy binds the node, link and bearer components closely together, so
      that their relationship becomes unnecessarily complex. In the worst
      case, such a locking policy not only hurts performance but is also
      prone to deadlock.
      
      In order to decouple the complex relationship between bearer and node
      as well as link, the locking policy is adjusted as follows:
      
      - Bearer level
        RTNL lock is used on update side, and RCU is used on read side.
        Meanwhile, all bearer instances including broadcast bearer are
        saved into bearer_list array.
      
      - Node and link level
        All node instances are saved into two lists, tipc_node_list and
        node_htable. The two lists are protected by node_list_lock on the
        write side and by RCU on the read side. All members of the node
        structure, including link instances, are protected by the node spin
        lock.
      
      - The relationship between bearer and node
        When link accesses bearer, it first needs to find the bearer with
        its bearer identity from the bearer_list array. When bearer accesses
        node, it can iterate the node_htable hash list with the node
        address to find the corresponding node.
      
      In the new locking policy, every component has its private locking
      solution and the relationship between bearer and node is very simple,
      that is, they can find each other with node address or bearer identity
      from node_htable hash list or bearer_list array.
      
      All of the changes above are now in place, so tipc_net_lock can be
      removed safely.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Tested-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>