1. 06 May, 2014 3 commits
  2. 23 Apr, 2014 2 commits
    • Y
      tipc: purge tipc_net_lock lock · 7216cd94
      Ying Xue committed
      Now tipc routing hierarchy comprises the structures 'node', 'link' and
      'bearer'. The whole hierarchy is protected by a big read/write lock,
      tipc_net_lock, to ensure that nothing is added or removed while code
      is accessing any of these structures. Obviously the locking policy
      makes node, link and bearer components closely bound together so that
      their relationship becomes unnecessarily complex. In the worst case,
      such a locking policy not only hurts performance, but is also prone
      to occasional deadlock.
      
      In order to decouple the complex relationship between bearer and node
      as well as link, the locking policy is adjusted as follows:
      
      - Bearer level
        RTNL lock is used on update side, and RCU is used on read side.
        Meanwhile, all bearer instances including broadcast bearer are
        saved into bearer_list array.
      
      - Node and link level
        All node instances are saved into two lists, tipc_node_list and
        node_htable. The two lists are protected by node_list_lock on write side,
        and they are guarded with RCU lock on read side. All members in node
        structure including link instances are protected by node spin lock.
      
      - The relationship between bearer and node
        When link accesses bearer, it first needs to find the bearer with
        its bearer identity from the bearer_list array. When bearer accesses
        node, it can iterate the node_htable hash list with the node
        address to find the corresponding node.
      
      In the new locking policy, every component has its private locking
      solution and the relationship between bearer and node is very simple,
      that is, they can find each other with node address or bearer identity
      from node_htable hash list or bearer_list array.
      
      All of the above changes are now in place, so tipc_net_lock can be
      removed safely.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Tested-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
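
A minimal user-space sketch of the node lookup described in this commit message, assuming a fixed-size hash table keyed by node address. The table size and the names node_hashfn(), node_insert() and node_find() are illustrative choices for this sketch, not the kernel's definitions; the comments mark where the kernel would hold node_list_lock or sit inside an RCU read-side section.

```c
#include <stddef.h>
#include <stdint.h>

#define NODE_HTABLE_SIZE 512 /* assumed size; must be a power of two here */

struct node {
    uint32_t addr;          /* node address, used as the lookup key */
    struct node *hash_next; /* next node in the same hash bucket */
};

static struct node *node_htable[NODE_HTABLE_SIZE];

static unsigned int node_hashfn(uint32_t addr)
{
    return addr & (NODE_HTABLE_SIZE - 1);
}

/* Update side: the kernel would hold node_list_lock around this. */
static void node_insert(struct node *n)
{
    unsigned int h = node_hashfn(n->addr);

    n->hash_next = node_htable[h];
    node_htable[h] = n;
}

/* Read side: the kernel would run this inside an RCU read-side section. */
static struct node *node_find(uint32_t addr)
{
    struct node *n;

    for (n = node_htable[node_hashfn(addr)]; n; n = n->hash_next)
        if (n->addr == addr)
            return n;
    return NULL;
}
```

The point of the sketch is that a bearer needs only the node address, not a stored node pointer, to reach the node.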
    • Y
      tipc: decouple the relationship between bearer and link · 7a2f7d18
      Ying Xue committed
      Currently on both paths of message transmission and reception, the
      read lock of tipc_net_lock must be held before bearer is accessed,
      while the write lock of tipc_net_lock has to be taken before bearer
      is configured. Although this ensures that the bearer is always valid on
      the two data paths, it binds link and bearer closely together.
      
      So, as part of the effort to remove tipc_net_lock, the locking
      policy of bearer protection will be adjusted as below: on the two
      data paths, RCU is used, and on the configuration path of bearer,
      RTNL lock is applied.
      
      Now RCU just covers the path of message reception. To make it possible
      to protect the path of message transmission with RCU, link should not
      use its stored bearer pointer to access bearer, but it should use the
      bearer identity of its attached bearer as index to get bearer instance
      from bearer_list array, which can help us decouple the relationship
      between bearer and link. As a result, bearer on the path of message
      transmission can be safely protected by RCU when we access bearer_list
      array within RCU lock protection.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Tested-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
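
The idea of replacing a stored bearer pointer with a bearer identity can be sketched as follows. This is a user-space illustration under assumptions: the array size, the struct layouts and the helper link_bearer() are hypothetical, and the plain array read stands in for what would be an RCU-protected dereference in the kernel.

```c
#include <stddef.h>

#define MAX_BEARERS 3 /* assumed array size for this sketch */

struct bearer {
    const char *name;
};

struct link {
    unsigned int bearer_id; /* index into bearer_list, not a pointer */
};

/* Update side would be serialized by the RTNL lock in the kernel. */
static struct bearer *bearer_list[MAX_BEARERS];

/* Return the link's bearer, or NULL if it has been disabled meanwhile. */
static struct bearer *link_bearer(const struct link *l)
{
    if (l->bearer_id >= MAX_BEARERS)
        return NULL;
    return bearer_list[l->bearer_id]; /* RCU-protected read in the kernel */
}
```

Because the link never caches the pointer, a disabled bearer simply shows up as NULL on the next lookup, and the caller has to handle that case.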
  3. 28 Mar, 2014 2 commits
  4. 13 Mar, 2014 1 commit
  5. 20 Feb, 2014 1 commit
    • E
      tipc: failed transmissions should return error · 63fa01c1
      Erik Hugne committed
      When a message could not be sent out because the destination node
      or link could not be found, the full message size is returned from
      sendmsg() as if it had been sent successfully. An application will
      then get a false indication that it's making forward progress. This
      problem has existed since the initial commit in 2.6.16.
      
      We change this to return -ENETUNREACH if the message cannot be
      delivered due to the destination node/link being unavailable. We
      also get rid of the redundant tipc_reject_msg call since freeing
      the buffer and doing a tipc_port_iovec_reject accomplishes exactly
      the same thing.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
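
A small sketch of the behavioural change, assuming a hypothetical helper send_to() and a toy `struct dest`; neither is part of the kernel code. It shows only the return-value contract: a byte count on success, and -ENETUNREACH instead of the message size when no destination node or link is available.

```c
#include <errno.h>
#include <stddef.h>

struct dest {
    int reachable; /* non-zero if a node/link to the destination exists */
};

/* Hypothetical helper: returns the number of bytes accepted for sending,
 * or -ENETUNREACH when the destination node/link cannot be found. */
static long send_to(const struct dest *d, size_t len)
{
    if (d == NULL || !d->reachable)
        return -ENETUNREACH;
    return (long)len;
}
```

With this contract a caller can tell a real send apart from a silently dropped message by checking for a negative return value.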
  6. 19 Feb, 2014 1 commit
    • Y
      tipc: align tipc function names with common naming practice in the network · 247f0f3c
      Ying Xue committed
      Rename the following functions to names that are shorter and more in
      line with common naming practice in the network subsystem.
      
      tipc_bclink_send_msg->tipc_bclink_xmit
      tipc_bclink_recv_pkt->tipc_bclink_rcv
      tipc_disc_recv_msg->tipc_disc_rcv
      tipc_link_send_proto_msg->tipc_link_proto_xmit
      link_recv_proto_msg->tipc_link_proto_rcv
      link_send_sections_long->tipc_link_iovec_long_xmit
      tipc_link_send_sections_fast->tipc_link_iovec_xmit_fast
      tipc_link_send_sync->tipc_link_sync_xmit
      tipc_link_recv_sync->tipc_link_sync_rcv
      tipc_link_send_buf->__tipc_link_xmit
      tipc_link_send->tipc_link_xmit
      tipc_link_send_names->tipc_link_names_xmit
      tipc_named_recv->tipc_named_rcv
      tipc_link_recv_bundle->tipc_link_bundle_rcv
      tipc_link_dup_send_queue->tipc_link_dup_queue_xmit
      link_send_long_buf->tipc_link_frag_xmit
      
      tipc_multicast->tipc_port_mcast_xmit
      tipc_port_recv_mcast->tipc_port_mcast_rcv
      tipc_port_reject_sections->tipc_port_iovec_reject
      tipc_port_recv_proto_msg->tipc_port_proto_rcv
      tipc_connect->tipc_port_connect
      __tipc_connect->__tipc_port_connect
      __tipc_disconnect->__tipc_port_disconnect
      tipc_disconnect->tipc_port_disconnect
      tipc_shutdown->tipc_port_shutdown
      tipc_port_recv_msg->tipc_port_rcv
      tipc_port_recv_sections->tipc_port_iovec_rcv
      
      release->tipc_release
      accept->tipc_accept
      bind->tipc_bind
      get_name->tipc_getname
      poll->tipc_poll
      send_msg->tipc_sendmsg
      send_packet->tipc_send_packet
      send_stream->tipc_send_stream
      recv_msg->tipc_recvmsg
      recv_stream->tipc_recv_stream
      connect->tipc_connect
      listen->tipc_listen
      shutdown->tipc_shutdown
      setsockopt->tipc_setsockopt
      getsockopt->tipc_getsockopt
      
      The above changes have no impact on current users of the functions.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 17 Feb, 2014 1 commit
  8. 14 Feb, 2014 14 commits
    • J
      tipc: add node_lock protection to link lookup function · e099e86c
      Jon Paul Maloy committed
      In an earlier commit, ("tipc: remove links list from bearer struct")
      we described three issues that need to be pre-emptively resolved before
      we can remove tipc_net_lock. Here we resolve issue a) described in that
      commit:
      
      "a) In access method #2, we access the link before taking the
          protecting node_lock. This will not work once net_lock is gone,
          so we will have to change the access order. We will deal with
          this in a later commit in this series."
      
      Here, we change that access order, by ensuring that the function
      link_find_link() returns only a safe reference for finding
      the link, i.e., a node pointer and an index into its 'links' array,
      not the link pointer itself. We also change all callers of this
      function to first take the node lock before they can check if there
      still is a valid link pointer at the returned index. Since the
      function now returns a node pointer rather than a link pointer,
      we rename it to the more appropriate 'tipc_link_find_owner()'.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
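
The access order described in this commit can be sketched in user space as follows. Everything here is an assumption made for illustration: the struct layouts, the toy `locked` flag standing in for the node spinlock, and the helper names find_owner() and set_tolerance(). The lookup hands back a node and a slot index only; the caller locks the node and then re-checks the slot.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BEARERS 3

struct link {
    char name[32];
    int tolerance;
};

struct node {
    int locked;                      /* toy stand-in for the node spinlock */
    struct link *links[MAX_BEARERS]; /* protected by the node lock */
};

static void node_lock(struct node *n)   { n->locked = 1; }
static void node_unlock(struct node *n) { n->locked = 0; }

/* Return the owning node and the slot index, never the link pointer. */
static struct node *find_owner(struct node *nodes, size_t n_nodes,
                               const char *name, unsigned int *idx)
{
    for (size_t i = 0; i < n_nodes; i++) {
        struct node *n = &nodes[i];
        struct node *found = NULL;

        node_lock(n);
        for (unsigned int b = 0; b < MAX_BEARERS; b++) {
            struct link *l = n->links[b];

            if (l && strcmp(l->name, name) == 0) {
                *idx = b;
                found = n;
                break;
            }
        }
        node_unlock(n);
        if (found)
            return found;
    }
    return NULL;
}

/* Caller pattern: take the node lock first, then re-check the slot. */
static int set_tolerance(struct node *nodes, size_t n_nodes,
                         const char *name, int value)
{
    unsigned int idx;
    struct node *n = find_owner(nodes, n_nodes, name, &idx);
    int ret = -1;

    if (!n)
        return -1;
    node_lock(n);
    if (n->links[idx]) { /* the link may have been deleted meanwhile */
        n->links[idx]->tolerance = value;
        ret = 0;
    }
    node_unlock(n);
    return ret;
}
```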
    • J
      tipc: delay delete of link when failover is needed · 7d33939f
      Jon Paul Maloy committed
      When a bearer is disabled, all its attached links are deleted.
      Ideally, we should do link failover to redundant links on other bearers,
      if there are any, in such cases. This would be consistent with current
      behavior when a link is reset, but not deleted. However, due to the
      complexity involved, and the (wrongly) perceived low demand for this
      feature, it was never implemented until now.
      
      We mark the doomed link for deletion with a new flag, but wait until the
      failover process is finished before we actually delete it. With the
      improved link tunnelling/failover code introduced earlier in this commit
      series, it is now easy to identify a spot in the code where the failover
      is finished and it is safe to delete the marked link. Moreover, the test
      for the flag and the deletion can be done synchronously, and outside the
      most time critical data path.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
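
A rough sketch of the mark-then-delete scheme, under stated assumptions: the flag names and values, the `exp_msg_count` counter and both helper functions are illustrative and chosen for this example. The link is only flagged when its bearer goes away, and it is freed at the single point where the last expected failover packet has been handled.

```c
#include <stdbool.h>
#include <stdlib.h>

#define LINK_STARTED 0x0001u
#define LINK_STOPPED 0x0002u /* marked for deletion once failover is done */

struct link {
    unsigned int flags;
    unsigned int exp_msg_count; /* tunnelled packets still expected */
};

/* Bearer disabled: mark the link instead of freeing it right away. */
static void link_mark_for_delete(struct link *l)
{
    l->flags |= LINK_STOPPED;
}

/* Account for one tunnelled failover packet. Returns true, and clears
 * *lp, if failover is finished and the marked link was deleted. */
static bool failover_pkt_done(struct link **lp)
{
    struct link *l = *lp;

    if (l->exp_msg_count > 0)
        l->exp_msg_count--;
    if (l->exp_msg_count == 0 && (l->flags & LINK_STOPPED)) {
        free(l);
        *lp = NULL;
        return true;
    }
    return false;
}
```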
    • J
      tipc: changes to general packet reception algorithm · a5377831
      Jon Paul Maloy committed
      We change the order of checking for destination users when processing
      incoming packets. By placing the checks for users that may potentially
      replace the processed buffer, i.e., CHANGEOVER_PROTOCOL and
      MSG_FRAGMENTER, in a separate step before we check for the true end
      users, we get rid of a label and a 'goto', at the same time making the
      code more comprehensible and easy to follow.
      
      This commit does not change any functionality, it is just a cosmetic
      code reshuffle.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: rename stack variables in function tipc_link_tunnel_rcv · 02842f71
      Jon Paul Maloy committed
      After the previous redesign of the tunnel reception algorithm and
      functions, we finalize it by renaming a couple of stack variables
      in tipc_tunnel_rcv(). This makes it more consistent with the naming
      scheme elsewhere in this part of the code.
      
      This change is purely cosmetic, with no functional changes.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: more cleanup of tunnelling reception function · 1e9d47a9
      Jon Paul Maloy committed
      We simplify and slim down the code in function tipc_tunnel_rcv().
      No impact on the users of this function.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: change signature of tunnelling reception function · 3bb53380
      Jon Paul Maloy committed
      After the earlier commits in this series related to the function
      tipc_link_tunnel_rcv(), we can now go further and simplify its
      signature.
      
      The function now consumes all DUPLICATE packets, and only returns such
      ORIGINAL packets that are ready for immediate delivery, i.e., no
      more link level protocol processing needs to be done by the caller.
      As a consequence, the caller, tipc_rcv(), does not access the link
      pointer after call return, and it becomes unnecessary to pass a link
      pointer reference in the call. Instead, we now only pass it the tunnel
      link's owner node, which is sufficient to find the destination link for
      the tunnelled packet.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: change reception of tunnelled failover packets · f006c9c7
      Jon Paul Maloy committed
      When a link is reset, and there is a redundant link available, all
      sender sockets will steer their subsequent traffic through the
      remaining link. In order to guarantee preserved packet order and
      cardinality during the transition, we tunnel the failing link's send
      queue through the remaining link before we allow any sockets to use it.
      
      In this commit, we change the algorithm for receiving failover
      ("ORIGINAL_MSG") packets in tipc_link_tunnel_rcv(), at the same time
      delegating it to a new subfunction, tipc_link_failover_rcv(). Instead
      of directly returning an extracted inner packet to the packet reception
      loop in tipc_rcv(), we first check if it is a message fragment, in which
      case we append it to the reset link's fragment chain. If the fragment
      chain is complete, we return the whole chain instead of the individual
      buffer, eliminating any need for the tipc_rcv() loop to do reassembly of
      tunneled packets.
      
      This change makes it possible to further simplify tipc_link_tunnel_rcv(),
      as well as the calling tipc_rcv() loop. We will do that in later
      commits. It also makes it possible to identify a single spot in the code
      where we can tell that a failover procedure is finished, something that
      is useful when we are deleting links after a failover. This will also
      be done in a later commit.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: change reception of tunnelled duplicate packets · 1dab3d5a
      Jon Paul Maloy committed
      When a second link to a destination comes up, some sender sockets will
      steer their subsequent traffic through the new link. In order to
      guarantee preserved packet order and cardinality for those sockets, we
      tunnel a duplicate of the old link's send queue through the new link
      before we open it for regular traffic. The last arriving packet copy,
      on whichever link, will be dropped at the receiving end based on the
      original sequence number, to ensure that only one copy is delivered to
      the end receiver.
      
      In this commit, we change the algorithm for receiving DUPLICATE_MSG
      packets, at the same time delegating it to a new subfunction,
      tipc_link_dup_rcv(). Instead of returning an extracted inner packet to
      the packet reception loop in tipc_rcv(), we just add it to the receiving
      (new) link's deferred packet queue. The packet will then be processed by
      that link when it receives its first non-tunneled packet, i.e., at
      latest when the changeover procedure is finished.
      
      Because tipc_link_tunnel_rcv()/tipc_link_dup_rcv() now is consuming all
      packets of type DUPLICATE_MSG, the calling tipc_rcv() function can omit
      testing for this. This in turn means that the current conditional jump
      to the label 'protocol_check' becomes redundant, and we can remove that
      label.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
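
Dropping the last-arriving copy based on the original sequence number can be illustrated with the sketch below. It assumes 16-bit sequence numbers that wrap around, and the state struct, enum and function names are made up for this example; real out-of-order handling would defer the packet rather than just report it.

```c
#include <stdbool.h>
#include <stdint.h>

enum pkt_verdict { PKT_DELIVERED, PKT_DUPLICATE, PKT_OUT_OF_ORDER };

struct rcv_state {
    uint16_t next_in; /* next expected sequence number */
};

/* True if a precedes b in 16-bit modular sequence space. */
static bool seq_less(uint16_t a, uint16_t b)
{
    return a != b && (uint16_t)(b - a) < 32768u;
}

/* Classify a packet by its original sequence number. A copy whose
 * number was already accepted is reported as a duplicate and dropped. */
static enum pkt_verdict accept_pkt(struct rcv_state *s, uint16_t seqno)
{
    if (seq_less(seqno, s->next_in))
        return PKT_DUPLICATE;    /* the other copy arrived first */
    if (seqno != s->next_in)
        return PKT_OUT_OF_ORDER; /* a real link would defer this packet */
    s->next_in++;
    return PKT_DELIVERED;
}
```

Whichever link delivers a given sequence number first wins; the copy arriving later on the other link falls below `next_in` and is discarded.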
    • Y
      tipc: remove 'links' list from tipc_bearer struct · c61dd61d
      Ying Xue committed
      In our ongoing effort to simplify the TIPC locking structure,
      we see a need to remove the linked list for tipc_links
      in the bearer. This can be explained as follows.
      
      Currently, we have three different ways to access a link,
      via three different lists/tables:
      
      1: Via a node hash table:
         Used by the time-critical outgoing/incoming data paths.
         (e.g. link_send_sections_fast() and tipc_recv_msg() ):
      
      grab net_lock(read)
         find node from node hash table
         grab node_lock
             select link
             grab bearer_lock
                send_msg()
             release bearer_lock
         release node lock
      release net_lock
      
      2: Via a global linked list for nodes:
         Used by configuration commands (link_cmd_set_value())
      
      grab net_lock(read)
         find node and link from global node list (using link name)
         grab node_lock
             update link
         release node lock
      release net_lock
      
      (Same locking order as above. No problem.)
      
      3: Via the bearer's linked link list:
         Used by notifications from interface (e.g. tipc_disable_bearer() )
      
      grab net_lock(write)
         grab bearer_lock
            get link ptr from bearer's link list
            get node from link
            grab node_lock
               delete link
            release node lock
         release bearer_lock
      release net_lock
      
      (Different order from above, but works because we grab the
      outer net_lock in write mode first, excluding all other access.)
      
      The first major goal in our simplification effort is to get rid
      of the "big" net_lock, replacing it with rcu-locks when accessing
      the node list and node hash array. This will come in a later patch
      series.
      
      But to get there we first need to rewrite access methods #2 and #3,
      since removal of net_lock would introduce three major problems:
      
      a) In access method #2, we access the link before taking the
         protecting node_lock. This will not work once net_lock is gone,
         so we will have to change the access order. We will deal with
         this in a later commit in this series, "tipc: add node lock
         protection to link found by link_find_link()".
      
      b) When the outer protection from net_lock is gone, taking
         bearer_lock and node_lock in opposite order of method 1) and 2)
         will become an obvious deadlock hazard. This is fixed in the
         commit ("tipc: remove bearer_lock from tipc_bearer struct")
         later in this series.
      
      c) Similar to what is described in problem a), access method #3
         starts with using a link pointer that is unprotected by node_lock,
         in order to find the correct node struct via that pointer and
         lock it. Before we remove net_lock, this access order must be
         altered. This is what we do with this commit.
      
      We can avoid introducing problem c) by using the global node list
      here as well to find the node, before accessing its links. When we
      loop through the node list, we use the bearer's own identity as the
      search criterion, thus easily finding the links that are associated
      with the resetting/disabling bearer. It should be noted that although this
      method is somewhat slower than the current list traversal, it is in
      no way time critical. This is only about resetting or deleting links,
      something that must be considered relatively infrequent events.
      
      As a bonus, we can get rid of the mutual pointers between links and
      bearers. After this commit, pointer dependencies go in one direction
      only: from the link to the bearer.
      
      This commit pre-empts introduction of problem c) as described above.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
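
The replacement for access method #3 can be sketched as a walk over the global node list using the bearer identity as the search key. This is a user-space illustration only: the struct layouts, the toy `locked` flag in place of the node spinlock, and the function name link_delete_list() are assumptions of the sketch. The node lock is always taken before the link slot is touched, which is the same order as in methods #1 and #2.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_BEARERS 3

struct link {
    unsigned int bearer_id;
};

struct node {
    int locked;                      /* toy stand-in for the node spinlock */
    struct link *links[MAX_BEARERS]; /* one slot per bearer identity */
    struct node *next;               /* global node list */
};

static void node_lock(struct node *n)   { n->locked = 1; }
static void node_unlock(struct node *n) { n->locked = 0; }

/* Delete every link attached to bearer_id; returns how many were deleted. */
static unsigned int link_delete_list(struct node *node_list,
                                     unsigned int bearer_id)
{
    unsigned int deleted = 0;

    if (bearer_id >= MAX_BEARERS)
        return 0;
    for (struct node *n = node_list; n; n = n->next) {
        node_lock(n);
        if (n->links[bearer_id]) {
            free(n->links[bearer_id]);
            n->links[bearer_id] = NULL;
            deleted++;
        }
        node_unlock(n);
    }
    return deleted;
}
```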
    • Y
      tipc: redefine 'started' flag in struct link to bitmap · 135daee6
      Ying Xue committed
      Currently, the 'started' field in struct tipc_link represents only a
      binary state, 'started' or 'not started'. We need it to represent
      more link execution states in the coming commits in this series.
      Hence, we rename the field to 'flags', and define the current
      started/non-started state to be represented by the least significant bit of
      that field.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Y
      tipc: move code for deleting links from bearer.c to link.c · 8d8439b6
      Ying Xue committed
      We break out the code for deleting attached links in the
      function bearer_disable(), and define a new function named
      tipc_link_delete_list() to do this job.
      
      This commit incurs no functional changes, but makes the code of
      function bearer_disable() cleaner. It is also a preparation
      for a more important change to the bearer code, in a subsequent
      commit in this series.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Y
      tipc: move code for resetting links from bearer.c to link.c · e0ca2c30
      Ying Xue committed
      We break out the code for resetting attached links in the
      function tipc_reset_bearer(), and define a new function named
      tipc_link_reset_list() to do this job.
      
      This commit incurs no functional changes, but makes the code
      of function tipc_reset_bearer() cleaner. It is also a preparation
      for a more important change to the bearer code, in a subsequent
      commit in this series.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: stricter behavior of message reassembly function · 03b92017
      Jon Paul Maloy committed
      The function tipc_link_recv_fragment(struct sk_buff **buf) currently
      leaves the value of the input buffer pointer undefined when it returns,
      except when the return code indicates that the reassembly is complete.
      This despite the fact that it always consumes the input buffer.
      
      Here, we enforce a stricter behavior by this function, ensuring that
      the returned buffer pointer is non-NULL if and only if the reassembly
      is complete. This makes it possible to test the buffer pointer as
      the criterion for successful reassembly.
      
      We also rename the function to tipc_link_frag_rcv(), which is both
      shorter and more in line with common naming practice in the network
      subsystem.
      
      Apart from the new name, these changes have no impact on current
      users of the function, but make it more practical for use in some
      planned future commits.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
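
The stricter contract can be sketched as below, using toy types rather than `struct sk_buff`. The names `struct buf`, `struct reasm` and frag_rcv() are hypothetical. The function always consumes the input buffer, and on return `*buf` is non-NULL if and only if reassembly is complete.

```c
#include <stddef.h>

struct buf {
    int is_last;      /* non-zero for the final fragment of a message */
    struct buf *next; /* next fragment in the chain */
};

struct reasm {
    struct buf *head, *tail; /* fragment chain collected so far */
};

/* Always consumes *buf. On return, *buf is the complete chain if this
 * was the last fragment, and NULL otherwise. */
static void frag_rcv(struct reasm *r, struct buf **buf)
{
    struct buf *frag = *buf;

    *buf = NULL; /* the input is consumed in every case */
    frag->next = NULL;
    if (r->tail)
        r->tail->next = frag;
    else
        r->head = frag;
    r->tail = frag;

    if (frag->is_last) {
        *buf = r->head; /* hand the whole chain to the caller */
        r->head = r->tail = NULL;
    }
}
```

A caller can then write `frag_rcv(&r, &b); if (b) deliver(b);` without looking at a separate return code, where deliver() stands for whatever the caller does with a complete message.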
    • E
      tipc: fix message corruption bug for deferred packets · 64380a04
      Erik Hugne committed
      If a packet received on a link is out-of-sequence, it will be
      placed on a deferred queue and later reinserted in the receive
      path once the preceding packets have been processed. The problem
      with this is that it will be subject to the buffer adjustment from
      link_recv_buf_validate twice. The second adjustment for 20 bytes
      header space will corrupt the packet.
      
      We solve this by tagging the deferred packets and bailing out from
      receive buffer validation for packets that have already been
      subjected to this.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
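
The tagging approach can be illustrated with a toy model. The struct, the `deferred` tag and both functions are assumptions of this sketch, and the offset bump merely stands in for the real buffer adjustment; only the 20-byte figure comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define HDR_ADJUST 20 /* header space mentioned in the commit message */

struct pkt {
    size_t data_off; /* toy model of the adjusted buffer position */
    bool deferred;   /* set when the packet is parked on the deferred queue */
};

/* Validate and adjust a packet once; skip packets that were already
 * validated before being deferred. */
static bool validate(struct pkt *p)
{
    if (p->deferred)
        return true;           /* already adjusted on the first pass */
    p->data_off += HDR_ADJUST; /* stand-in for the real adjustment */
    return true;
}

/* Park an out-of-sequence packet and tag it. */
static void defer(struct pkt *p)
{
    p->deferred = true;
}
```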
  9. 08 Jan, 2014 4 commits
    • J
      tipc: make link start event synchronous · 581465fa
      Jon Paul Maloy committed
      When a link is created we delay the start event by launching it
      to be executed later in a tasklet. As we hold all the
      necessary locks at the moment of creation, and there is no risk
      of deadlock or contention, this delay serves no purpose in the
      current code.
      
      We remove this obsolete indirection step, and the associated function
      link_start(). At the same time, we rename the function tipc_link_stop()
      to the more appropriate tipc_link_purge_queues().
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: remove 'has_redundant_link' flag from STATE link protocol messages · b9d4c339
      Jon Paul Maloy committed
      The flag 'has_redundant_link' is defined only in RESET and ACTIVATE
      protocol messages. Due to an ambiguity in the protocol specification it
      is currently also transferred in STATE messages. Its value is used to
      initialize a link state variable, 'permit_changeover', which is used
      to inhibit futile link failover attempts when it is known that the
      peer node has no working links at the moment, although the local node
      may still think it has one.
      
      The fact that 'has_redundant_link' incorrectly is read from STATE
      messages has the effect that 'permit_changeover' sometimes gets a wrong
      value, and permanently blocks any links from being re-established. Such
      failures can only occur in dual-link systems, and are extremely rare.
      This bug seems to have always been present in the code.
      
      Furthermore, since commit b4b56102
      ("tipc: Ensure both nodes recognize loss of contact between them"),
      the 'permit_changeover' field serves no purpose any more. The task of
      enforcing 'lost contact' cycles at both peer endpoints is now taken
      by a new mechanism, using the flags WAIT_NODE_DOWN and WAIT_PEER_DOWN
      in struct tipc_node to abort unnecessary failover attempts.
      
      We therefore remove the 'has_redundant_link' flag from STATE messages,
      as well as the now redundant 'permit_changeover' variable.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: rename functions related to link failover and improve comments · 170b3927
      Jon Paul Maloy committed
      The functionality related to link addition and failover is unnecessarily
      hard to understand and maintain. We try to improve this by renaming
      some of the functions, at the same time adding or improving the
      explanatory comments around them. Names such as "tipc_rcv()" etc. also
      align better with what is used in other networking components.
      
      The changes in this commit are purely cosmetic, no functional changes
      are made.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      tipc: correctly unlink packets from deferred packet queue · 732256b9
      Erik Hugne committed
      When we pull a received packet from a link's 'deferred packets' queue
      for processing, its 'next' pointer is not cleared, and still refers to
      the next packet in that queue, if any. This is incorrect, but caused
      no harm before commit 40ba3cdf ("tipc:
      message reassembly using fragment chain") was introduced. After that
      commit, it may sometimes lead to the following oops:
      
      general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      Modules linked in: tipc
      CPU: 4 PID: 0 Comm: swapper/4 Tainted: G        W 3.13.0-rc2+ #6
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      task: ffff880017af4880 ti: ffff880017aee000 task.ti: ffff880017aee000
      RIP: 0010:[<ffffffff81710694>]  [<ffffffff81710694>] skb_try_coalesce+0x44/0x3d0
      RSP: 0018:ffff880016603a78  EFLAGS: 00010212
      RAX: 6b6b6b6bd6d6d6d6 RBX: ffff880013106ac0 RCX: ffff880016603ad0
      RDX: ffff880016603ad7 RSI: ffff88001223ed00 RDI: ffff880013106ac0
      RBP: ffff880016603ab8 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000000 R12: ffff88001223ed00
      R13: ffff880016603ad0 R14: 000000000000058c R15: ffff880012297650
      FS:  0000000000000000(0000) GS:ffff880016600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 000000000805b000 CR3: 0000000011f5d000 CR4: 00000000000006e0
      Stack:
       ffff880016603a88 ffffffff810a38ed ffff880016603aa8 ffff88001223ed00
       0000000000000001 ffff880012297648 ffff880016603b68 ffff880012297650
       ffff880016603b08 ffffffffa0006c51 ffff880016603b08 00ffffffa00005fc
      Call Trace:
       <IRQ>
       [<ffffffff810a38ed>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffffa0006c51>] tipc_link_recv_fragment+0xd1/0x1b0 [tipc]
       [<ffffffffa0007214>] tipc_recv_msg+0x4e4/0x920 [tipc]
       [<ffffffffa00016f0>] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
       [<ffffffffa000177c>] tipc_l2_rcv_msg+0xcc/0x250 [tipc]
       [<ffffffffa00016f0>] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
       [<ffffffff8171e65b>] __netif_receive_skb_core+0x80b/0xd00
       [<ffffffff8171df94>] ? __netif_receive_skb_core+0x144/0xd00
       [<ffffffff8171eb76>] __netif_receive_skb+0x26/0x70
       [<ffffffff8171ed6d>] netif_receive_skb+0x2d/0x200
       [<ffffffff8171fe70>] napi_gro_receive+0xb0/0x130
       [<ffffffff815647c2>] e1000_clean_rx_irq+0x2c2/0x530
       [<ffffffff81565986>] e1000_clean+0x266/0x9c0
       [<ffffffff81985f7b>] ? notifier_call_chain+0x2b/0x160
       [<ffffffff8171f971>] net_rx_action+0x141/0x310
       [<ffffffff81051c1b>] __do_softirq+0xeb/0x480
       [<ffffffff819817bb>] ? _raw_spin_unlock+0x2b/0x40
       [<ffffffff810b8c42>] ? handle_fasteoi_irq+0x72/0x100
       [<ffffffff81052346>] irq_exit+0x96/0xc0
       [<ffffffff8198cbc3>] do_IRQ+0x63/0xe0
       [<ffffffff81981def>] common_interrupt+0x6f/0x6f
       <EOI>
      
      This happens when the last fragment of a message has passed through the
      receiving link's 'deferred packets' queue, and at least one other
      packet was added to that queue while it was there. After the fragment
      chain with the complete message has been successfully delivered to the
      receiving socket, it is released. Since the 'next' pointer of the last
      fragment in the released chain now is non-NULL, we get the crash shown
      above.
      
      We fix this by clearing the 'next' pointer of all received packets,
      including those being pulled from the 'deferred' queue, before they
      undergo any further processing.
      
      Fixes: 40ba3cdf ("tipc: message reassembly using fragment chain")
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reported-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 05 Jan, 2014 1 commit
  11. 11 Dec, 2013 2 commits
  12. 10 Dec, 2013 1 commit
    • E
      tipc: remove interface state mirroring in bearer · 512137ee
      Committed by Erik Hugne
      struct 'tipc_bearer' is a generic representation of the underlying
      media type, and exists in a one-to-one relationship to each interface
      TIPC is using. The struct contains a 'blocked' flag that mirrors the
      operational and execution state of the represented interface, and is
      updated through notification calls from the latter. The users of
      tipc_bearer are checking this flag before each attempt to send a
      packet via the interface.
      
      This state mirroring serves no purpose in the current code base. TIPC
      links will not discover a media failure any faster through this
      mechanism, and in reality the flag only adds overhead at packet
      sending and reception.
      
      Furthermore, the fact that the flag needs to be protected by a spinlock
      aggregated into tipc_bearer has turned out to cause a serious and
      completely unnecessary deadlock problem.
      
      CPU0                                    CPU1
      ----                                    ----
      Time 0: bearer_disable()                link_timeout()
      Time 1:   spin_lock_bh(&b_ptr->lock)      tipc_link_push_queue()
      Time 2:   tipc_link_delete()                tipc_bearer_blocked(b_ptr)
      Time 3:     k_cancel_timer(&req->timer)       spin_lock_bh(&b_ptr->lock)
      Time 4:       del_timer_sync(&req->timer)
      
      I.e., del_timer_sync() on CPU0 never returns, because the timer handler
      on CPU1 is waiting for the bearer lock.
      
      We eliminate the 'blocked' flag from struct tipc_bearer, along with all
      tests on this flag. This not only resolves the deadlock, but also
      simplifies and speeds up the data path execution of TIPC. It also fits
      well into our ongoing effort to make the locking policy simpler and
      more manageable.
      
      An effect of this change is that we can get rid of functions such as
      tipc_bearer_blocked(), tipc_continue() and tipc_block_bearer().
      We replace the latter with a new function, tipc_reset_bearer(), which
      resets all links associated to the bearer immediately after an
      interface goes down.
      
      A user might notice one slight change in link behaviour after this
      change. When an interface goes down (e.g. through a NETDEV_DOWN
      event), all attached links will be reset immediately, instead of
      leaving it to each link to detect the failure through a timer-driven
      mechanism. We consider this an improvement, and see no obvious risks
      with the new behavior.
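
      The new behaviour can be sketched roughly as follows. This is a
      simplified userspace model under stated assumptions, not the actual
      tipc_reset_bearer(): 'struct link', 'enum link_state' and
      reset_bearer_links() are hypothetical names, and a flat array stands
      in for however the real code tracks the links of a bearer.

```c
#include <assert.h>
#include <stddef.h>

enum link_state { LINK_UP, LINK_RESET };

/* Hypothetical, simplified link: only the owning bearer id and a state. */
struct link {
	int bearer_id;
	enum link_state state;
};

/* Reset every link attached to bearer 'bearer_id' and return how many
 * links were reset. This models "reset immediately when the interface
 * goes down" instead of waiting for each link's timer to notice. */
static int reset_bearer_links(struct link *links, size_t n, int bearer_id)
{
	int count = 0;
	size_t i;

	assert(links != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (links[i].bearer_id == bearer_id &&
		    links[i].state != LINK_RESET) {
			links[i].state = LINK_RESET;
			count++;
		}
	}
	return count;
}
```

      Note that no per-bearer 'blocked' flag is consulted or updated here,
      so the send path needs no bearer lock for this purpose.
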
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <Paul.Gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      512137ee
  13. 15 Nov, 2013 1 commit
  14. 08 Nov, 2013 3 commits
    • E
      tipc: reassembly failures should cause link reset · a715b49e
      Committed by Erik Hugne
      If appending a received fragment to the pending fragment chain
      in a unicast link fails, the current code tries to force a retransmission
      of the fragment by decrementing the 'next received sequence number'
      field in the link. This is done under the assumption that the failure
      is caused by an out-of-memory situation, an assumption that does
      not hold true after the previous patch in this series.
      
      A failure to append a fragment can now only be caused by a protocol
      violation by the sending peer, and it must hence be assumed that it
      is either malicious or buggy.  Either way, the correct behavior is now
      to reset the link instead of trying to revert its sequence number.
      So, this is what we do in this commit.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a715b49e
    • E
      tipc: message reassembly using fragment chain · 40ba3cdf
      Committed by Erik Hugne
      When the first fragment of a long data message is received on a link, a
      reassembly buffer large enough to hold the data from this and all subsequent
      fragments of the message is allocated. The payload of each new fragment is
      copied into this buffer upon arrival. When the last fragment is received, the
      reassembled message is delivered upwards to the port/socket layer.
      
      Not only is this an inefficient approach, but it may also cause bursts of
      reassembly failures in low memory situations, since we may fail to allocate
      the necessary large buffer in the first place. Furthermore, after 100 subsequent
      such failures the link will be reset, something that in reality aggravates the
      situation.
      
      To remedy this problem, this patch introduces a different approach. Instead of
      allocating a big reassembly buffer, we now append the arriving fragments
      to a reassembly chain on the link, and deliver the whole chain up to the
      socket layer once the last fragment has been received. This is safe because
      the retransmission layer of a TIPC link always delivers packets in strict
      uninterrupted order, to the reassembly layer as to all other upper layers.
      Hence there can never be more than one fragment chain pending reassembly at
      any given time in a link, and we can trust (but still verify) that the
      fragments will be chained up in the correct order.
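
      The chaining approach can be illustrated with a small userspace
      sketch. It is not the kernel implementation: 'struct frag',
      'struct reasm', the fragment type names and reasm_append() are
      assumptions made for this example, and buffers are not freed here
      as a real implementation would have to do.

```c
#include <assert.h>
#include <stddef.h>

enum frag_type { FIRST_FRAGMENT, FRAGMENT, LAST_FRAGMENT };

/* Hypothetical fragment buffer. */
struct frag {
	struct frag *next;
	enum frag_type type;
};

/* Pending reassembly state of one link: at most one chain at a time. */
struct reasm {
	struct frag *head;
	struct frag *tail;
};

/* Append 'f' to the pending chain. Returns 1 and stores the complete
 * chain in '*done' once the last fragment has arrived, 0 if more
 * fragments are expected, and -1 if the fragment is out of order, in
 * which case the pending chain is dropped. */
static int reasm_append(struct reasm *r, struct frag *f, struct frag **done)
{
	assert(r && f && done);
	f->next = NULL;
	if (f->type == FIRST_FRAGMENT) {
		if (r->head)
			goto err;	/* a chain is already pending */
		r->head = r->tail = f;
		return 0;
	}
	if (!r->head)
		goto err;		/* no first fragment seen */
	r->tail->next = f;
	r->tail = f;
	if (f->type == LAST_FRAGMENT) {
		*done = r->head;
		r->head = r->tail = NULL;
		return 1;
	}
	return 0;
err:
	r->head = r->tail = NULL;
	return -1;
}
```

      No large buffer is allocated and no payload is copied in this
      model; the only per-fragment cost is a pointer update.
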
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      40ba3cdf
    • E
      tipc: don't reroute message fragments · 528f6f4b
      Committed by Erik Hugne
      When a message fragment is received in a broadcast or unicast link,
      the reception code will append the fragment payload to a big reassembly
      buffer through a call to the function tipc_recv_fragm(). However, after
      the return of that call, the logic goes on and passes the fragment
      buffer to the function tipc_net_route_msg(), which will simply drop it.
      This behavior is a remnant from the now obsolete multi-cluster
      functionality, and has no relevance in the current code base.
      
      Although currently harmless, this unnecessary call would be fatal
      after applying the next patch in this series, which introduces
      a completely new reassembly algorithm. So we change the code to
      eliminate the redundant call.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      528f6f4b
  15. 31 Oct, 2013 1 commit
    • Y
      tipc: remove two indentation levels in tipc_recv_msg routine · 3af390e2
      Committed by Ying Xue
      The message dispatching part of tipc_recv_msg() is wrapped in layers of
      while/if/if/switch, causing out-of-control indentation that does not
      look very good. We reduce two indentation levels by separating the
      message dispatching from the blocks that check link state and
      sequence numbers, allowing longer function and arg names to be
      consistently indented without wrapping. Additionally, we rename the
      "cont" label to "discard" and add one new label called "unlock_discard"
      to make the code clearer. In all, these are cosmetic changes that do not
      alter the operation of TIPC in any way.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Cc: David Laight <david.laight@aculab.com>
      Cc: Andreas Bofjäll <andreas.bofjall@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3af390e2
  16. 19 Oct, 2013 2 commits