1. 14 Feb, 2014 15 commits
    • J
      tipc: add node_lock protection to link lookup function · e099e86c
      Jon Paul Maloy committed
      In an earlier commit, ("tipc: remove links list from bearer struct")
      we described three issues that need to be pre-emptively resolved before
      we can remove tipc_net_lock. Here we resolve issue a) described in that
      commit:
      
      "a) In access method #2, we access the link before taking the
          protecting node_lock. This will not work once net_lock is gone,
          so we will have to change the access order. We will deal with
          this in a later commit in this series."
      
      Here, we change that access order, by ensuring that the function
      link_find_link() returns only a safe reference for finding
      the link, i.e., a node pointer and an index into its 'links' array,
      not the link pointer itself. We also change all callers of this
      function to first take the node lock before they can check if there
      still is a valid link pointer at the returned index. Since the
      function now returns a node pointer rather than a link pointer,
      we rename it to the more appropriate 'tipc_link_find_owner()'.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e099e86c
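The access order described in the commit above (look up the owner node and an index, take the node lock, then re-check the link slot) can be sketched in user-space C roughly as follows. This is an illustration only, not the kernel code: `toy_node`, `toy_link`, `find_owner()`, `set_tolerance()` and the flag-based lock stand-in are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BEARERS 2

struct toy_link { char name[32]; int tolerance; };

struct toy_node {
    int locked;                          /* stand-in for node_lock */
    struct toy_link *links[MAX_BEARERS]; /* a slot is NULL once its link is deleted */
};

static void node_lock(struct toy_node *n)   { assert(!n->locked); n->locked = 1; }
static void node_unlock(struct toy_node *n) { assert(n->locked);  n->locked = 0; }

/* Return the owner node and a slot index, never the link pointer itself. */
static struct toy_node *find_owner(struct toy_node *nodes, size_t n_nodes,
                                   const char *name, unsigned int *idx)
{
    for (size_t i = 0; i < n_nodes; i++) {
        struct toy_node *found = NULL;

        node_lock(&nodes[i]);
        for (unsigned int b = 0; b < MAX_BEARERS; b++) {
            struct toy_link *l = nodes[i].links[b];
            if (l && strcmp(l->name, name) == 0) {
                *idx = b;
                found = &nodes[i];
                break;
            }
        }
        node_unlock(&nodes[i]);
        if (found)
            return found;
    }
    return NULL;
}

/* Caller side: lock the node first, then check that the slot is still valid. */
static int set_tolerance(struct toy_node *nodes, size_t n_nodes,
                         const char *name, int value)
{
    unsigned int idx;
    struct toy_node *n = find_owner(nodes, n_nodes, name, &idx);
    int rc = -1;

    if (!n)
        return -1;
    node_lock(n);
    if (n->links[idx]) {   /* the link may have been deleted in the meantime */
        n->links[idx]->tolerance = value;
        rc = 0;
    }
    node_unlock(n);
    return rc;
}
```

The point of the design is that a node pointer plus an index stays meaningful even if the link disappears between lookup and use, whereas a raw link pointer would not.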
    • Y
      tipc: remove bearer_lock from tipc_bearer struct · a8304529
      Ying Xue committed
      After the earlier commits ("tipc: remove 'links' list from
      tipc_bearer struct") and ("tipc: introduce new spinlock to protect
      struct link_req"), there is no longer any need to protect struct
      link_req or any link list by use of bearer_lock. Furthermore,
      we have eliminated the need for using bearer_lock during downcalls
      (send) from the link to the bearer, since we have ensured that
      bearers always have a longer life cycle than their associated links,
      and always contain valid data.
      
      So, the only need now for a lock protecting bearers is for guaranteeing
      consistency of the bearer list itself. For this, it is sufficient, at
      least for the time being, to continue applying 'net_lock' in write mode.
      
      By removing bearer_lock we also pre-empt introduction of issue b) described
      in the previous commit "tipc: remove 'links' list from tipc_bearer struct":
      
      "b) When the outer protection from net_lock is gone, taking
          bearer_lock and node_lock in opposite order of method 1) and 2)
          will become an obvious deadlock hazard".
      
      Therefore, we now eliminate the bearer_lock spinlock.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a8304529
    • J
      tipc: delay delete of link when failover is needed · 7d33939f
      Jon Paul Maloy committed
      When a bearer is disabled, all its attached links are deleted.
      Ideally, we should do link failover to redundant links on other bearers,
      if there are any, in such cases. This would be consistent with current
      behavior when a link is reset, but not deleted. However, due to the
      complexity involved, and the (wrongly) perceived low demand for this
      feature, it was never implemented until now.
      
      We mark the doomed link for deletion with a new flag, but wait until the
      failover process is finished before we actually delete it. With the
      improved link tunnelling/failover code introduced earlier in this commit
      series, it is now easy to identify a spot in the code where the failover
      is finished and it is safe to delete the marked link. Moreover, the test
      for the flag and the deletion can be done synchronously, and outside the
      most time critical data path.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d33939f
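The mark-then-delete idea from the commit above might look something like the following user-space sketch. It assumes a per-link counter of tunnelled packets still expected; the flag values, the `failover_pkts` field and both function names are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define LINK_STARTED 0x1u  /* hypothetical flag values */
#define LINK_STOPPED 0x2u  /* link is marked for deletion */

struct toy_link {
    unsigned int flags;
    unsigned int failover_pkts; /* tunnelled packets still expected */
};

/*
 * Bearer disabled: delete the link at once unless a failover must run first.
 * Returns true if the link was freed.
 */
static bool link_delete_or_mark(struct toy_link *l, bool failover_needed)
{
    if (failover_needed) {
        l->flags |= LINK_STOPPED;  /* defer the actual deletion */
        return false;
    }
    free(l);
    return true;
}

/*
 * Called once per received failover packet. When the last expected packet
 * has arrived, the failover is finished and a marked link can be deleted.
 * Returns true if the link was freed.
 */
static bool failover_rcv(struct toy_link *l)
{
    assert(l->failover_pkts > 0);
    if (--l->failover_pkts > 0)
        return false;
    if (l->flags & LINK_STOPPED) {
        free(l);
        return true;
    }
    return false;
}
```

Because the flag test sits at the single place where the failover is known to be complete, no timer or asynchronous cleanup is needed in this sketch.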
    • J
      tipc: changes to general packet reception algorithm · a5377831
      Jon Paul Maloy committed
      We change the order of checking for destination users when processing
      incoming packets. By placing the checks for users that may potentially
      replace the processed buffer, i.e., CHANGEOVER_PROTOCOL and
      MSG_FRAGMENTER, in a separate step before we check for the true end
      users, we get rid of a label and a 'goto', at the same time making the
      code more comprehensible and easy to follow.
      
      This commit does not change any functionality; it is just a cosmetic
      code reshuffle.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a5377831
    • J
      tipc: rename stack variables in function tipc_link_tunnel_rcv · 02842f71
      Jon Paul Maloy committed
      After the previous redesign of the tunnel reception algorithm and
      functions, we finalize it by renaming a couple of stack variables
      in tipc_link_tunnel_rcv(). This makes it more consistent with the naming
      scheme elsewhere in this part of the code.
      
      This change is purely cosmetic, with no functional changes.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      02842f71
    • J
      tipc: more cleanup of tunnelling reception function · 1e9d47a9
      Jon Paul Maloy committed
      We simplify and slim down the code in function tipc_link_tunnel_rcv().
      There is no impact on the users of this function.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1e9d47a9
    • J
      tipc: change signature of tunnelling reception function · 3bb53380
      Jon Paul Maloy committed
      After the earlier commits in this series related to the function
      tipc_link_tunnel_rcv(), we can now go further and simplify its
      signature.
      
      The function now consumes all DUPLICATE packets, and only returns such
      ORIGINAL packets that are ready for immediate delivery, i.e., no
      more link level protocol processing needs to be done by the caller.
      As a consequence, the caller, tipc_rcv(), does not access the link
      pointer after call return, and it becomes unnecessary to pass a link
      pointer reference in the call. Instead, we now only pass it the tunnel
      link's owner node, which is sufficient to find the destination link for
      the tunnelled packet.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3bb53380
    • J
      tipc: change reception of tunnelled failover packets · f006c9c7
      Jon Paul Maloy committed
      When a link is reset, and there is a redundant link available, all
      sender sockets will steer their subsequent traffic through the
      remaining link. In order to guarantee preserved packet order and
      cardinality during the transition, we tunnel the failing link's send
      queue through the remaining link before we allow any sockets to use it.
      
      In this commit, we change the algorithm for receiving failover
      ("ORIGINAL_MSG") packets in tipc_link_tunnel_rcv(), at the same time
      delegating it to a new subfunction, tipc_link_failover_rcv(). Instead
      of directly returning an extracted inner packet to the packet reception
      loop in tipc_rcv(), we first check if it is a message fragment, in which
      case we append it to the reset link's fragment chain. If the fragment
      chain is complete, we return the whole chain instead of the individual
      buffer, eliminating any need for the tipc_rcv() loop to do reassembly of
      tunneled packets.
      
      This change makes it possible to further simplify tipc_link_tunnel_rcv(),
      as well as the calling tipc_rcv() loop. We will do that in later
      commits. It also makes it possible to identify a single spot in the code
      where we can tell that a failover procedure is finished, something that
      is useful when we are deleting links after a failover. This will also
      be done in a later commit.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f006c9c7
    • J
      tipc: change reception of tunnelled duplicate packets · 1dab3d5a
      Jon Paul Maloy committed
      When a second link to a destination comes up, some sender sockets will
      steer their subsequent traffic through the new link. In order to
      guarantee preserved packet order and cardinality for those sockets, we
      tunnel a duplicate of the old link's send queue through the new link
      before we open it for regular traffic. The last arriving packet copy,
      on whichever link, will be dropped at the receiving end based on the
      original sequence number, to ensure that only one copy is delivered to
      the end receiver.
      
      In this commit, we change the algorithm for receiving DUPLICATE_MSG
      packets, at the same time delegating it to a new subfunction,
      tipc_link_dup_rcv(). Instead of returning an extracted inner packet to
      the packet reception loop in tipc_rcv(), we just add it to the receiving
      (new) link's deferred packet queue. The packet will then be processed by
      that link when it receives its first non-tunneled packet, i.e., at
      latest when the changeover procedure is finished.
      
      Because tipc_link_tunnel_rcv()/tipc_link_dup_rcv() now consumes all
      packets of type DUPLICATE_MSG, the calling tipc_rcv() function can omit
      testing for this. This in turn means that the current conditional jump
      to the label 'protocol_check' becomes redundant, and we can remove that
      label.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1dab3d5a
    • Y
      tipc: remove 'links' list from tipc_bearer struct · c61dd61d
      Ying Xue committed
      In our ongoing effort to simplify the TIPC locking structure,
      we see a need to remove the linked list for tipc_links
      in the bearer. This can be explained as follows.
      
      Currently, we have three different ways to access a link,
      via three different lists/tables:
      
      1: Via a node hash table:
         Used by the time-critical outgoing/incoming data paths.
         (e.g. link_send_sections_fast() and tipc_recv_msg() ):
      
      grab net_lock(read)
         find node from node hash table
         grab node_lock
             select link
             grab bearer_lock
                send_msg()
             release bearer_lock
         release node lock
      release net_lock
      
      2: Via a global linked list for nodes:
         Used by configuration commands (link_cmd_set_value())
      
      grab net_lock(read)
         find node and link from global node list (using link name)
         grab node_lock
             update link
         release node lock
      release net_lock
      
      (Same locking order as above. No problem.)
      
      3: Via the bearer's linked link list:
         Used by notifications from interface (e.g. tipc_disable_bearer() )
      
      grab net_lock(write)
         grab bearer_lock
            get link ptr from bearer's link list
            get node from link
            grab node_lock
               delete link
            release node lock
         release bearer_lock
      release net_lock
      
      (Different order from above, but works because we grab the
      outer net_lock in write mode first, excluding all other access.)
      
      The first major goal in our simplification effort is to get rid
      of the "big" net_lock, replacing it with rcu-locks when accessing
      the node list and node hash array. This will come in a later patch
      series.
      
      But to get there we first need to rewrite access methods #2 and #3,
      since removal of net_lock would introduce three major problems:
      
      a) In access method #2, we access the link before taking the
         protecting node_lock. This will not work once net_lock is gone,
         so we will have to change the access order. We will deal with
         this in a later commit in this series, "tipc: add node lock
         protection to link found by link_find_link()".
      
      b) When the outer protection from net_lock is gone, taking
         bearer_lock and node_lock in opposite order of method 1) and 2)
         will become an obvious deadlock hazard. This is fixed in the
         commit ("tipc: remove bearer_lock from tipc_bearer struct")
         later in this series.
      
      c) Similar to what is described in problem a), access method #3
         starts with using a link pointer that is unprotected by node_lock,
         in order to via that pointer find the correct node struct and
         lock it. Before we remove net_lock, this access order must be
         altered. This is what we do with this commit.
      
      We can avoid introducing problem c) by using the global node list
      here as well to find the node, before accessing its links. When
      we loop through the node list we use the bearer's own identity as the
      search criterion, thus easily finding the links that are associated with
      the resetting/disabling bearer. It should be noted that although this
      method is somewhat slower than the current list traversal, it is in
      no way time critical. This is only about resetting or deleting links,
      something that must be considered relatively infrequent events.
      
      As a bonus, we can get rid of the mutual pointers between links and
      bearers. After this commit, pointer dependencies go in one direction
      only: from the link to the bearer.
      
      This commit pre-empts introduction of problem c) as described above.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c61dd61d
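The replacement for access method #3 described above (walk the global node list, take each node's lock, and pick out the links that belong to the affected bearer) could be sketched as follows. This is a simplified user-space model under stated assumptions: the struct layout, the flag-based lock stand-in and the name `link_reset_list()` are hypothetical here and do not reproduce the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BEARERS 2

struct toy_link { unsigned int bearer_id; int up; };

struct toy_node {
    int locked;                          /* stand-in for node_lock */
    struct toy_link *links[MAX_BEARERS];
    struct toy_node *next;               /* global node list */
};

static void node_lock(struct toy_node *n)   { assert(!n->locked); n->locked = 1; }
static void node_unlock(struct toy_node *n) { assert(n->locked);  n->locked = 0; }

/*
 * Reset every link that runs over the given bearer. The node is always
 * found and locked before any of its link pointers is touched, so the
 * lock order matches access methods #1 and #2.
 * Returns the number of links that were reset.
 */
static unsigned int link_reset_list(struct toy_node *head, unsigned int bearer_id)
{
    unsigned int count = 0;

    for (struct toy_node *n = head; n; n = n->next) {
        node_lock(n);
        for (unsigned int i = 0; i < MAX_BEARERS; i++) {
            struct toy_link *l = n->links[i];
            if (l && l->bearer_id == bearer_id) {
                l->up = 0;
                count++;
            }
        }
        node_unlock(n);
    }
    return count;
}
```

The traversal visits every node rather than only the bearer's own links, which is the "somewhat slower" cost the commit message accepts for an infrequent operation.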
    • Y
      tipc: redefine 'started' flag in struct link to bitmap · 135daee6
      Ying Xue committed
      Currently, the 'started' field in struct tipc_link represents only a
      binary state, 'started' or 'not started'. We need it to represent
      more link execution states in the coming commits in this series.
      Hence, we rename the field to 'flags', and define the current
      started/non-started state to be represented by the least significant bit of
      that field.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      135daee6
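Turning a boolean field into a bitmap, as the commit above describes, usually follows a pattern like the one below. The flag names and values and the helper functions are assumptions made for this sketch; only the idea that the old 'started' state occupies the least significant bit comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define LINK_STARTED 0x0001u  /* bit 0: the former 'started' boolean */
#define LINK_STOPPED 0x0002u  /* hypothetical example of a later state bit */

struct toy_link { unsigned int flags; };

static void link_set(struct toy_link *l, unsigned int f)
{
    assert(f != 0);
    l->flags |= f;
}

static void link_clear(struct toy_link *l, unsigned int f)
{
    assert(f != 0);
    l->flags &= ~f;
}

static bool link_test(const struct toy_link *l, unsigned int f)
{
    return (l->flags & f) != 0;
}
```

Each state gets its own bit, so new execution states can be added later without touching the code that tests the existing ones.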
    • Y
      tipc: move code for deleting links from bearer.c to link.c · 8d8439b6
      Ying Xue committed
      We break out the code for deleting attached links in the
      function bearer_disable(), and define a new function named
      tipc_link_delete_list() to do this job.
      
      This commit incurs no functional changes, but makes the code of
      function bearer_disable() cleaner. It is also a preparation
      for a more important change to the bearer code, in a subsequent
      commit in this series.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d8439b6
    • Y
      tipc: move code for resetting links from bearer.c to link.c · e0ca2c30
      Ying Xue committed
      We break out the code for resetting attached links in the
      function tipc_reset_bearer(), and define a new function named
      tipc_link_reset_list() to do this job.
      
      This commit incurs no functional changes, but makes the code
      of function tipc_reset_bearer() cleaner. It is also a preparation
      for a more important change to the bearer code, in a subsequent
      commit in this series.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e0ca2c30
    • J
      tipc: stricter behavior of message reassembly function · 03b92017
      Jon Paul Maloy committed
      The function tipc_link_recv_fragment(struct sk_buff **buf) currently
      leaves the value of the input buffer pointer undefined when it returns,
      except when the return code indicates that the reassembly is complete.
      This despite the fact that it always consumes the input buffer.
      
      Here, we enforce a stricter behavior by this function, ensuring that
      the returned buffer pointer is non-NULL if and only if the reassembly
      is complete. This makes it possible to test the buffer pointer as
      the criterion for successful reassembly.
      
      We also rename the function to tipc_link_frag_rcv(), which is both
      shorter and more in line with common naming practice in the network
      subsystem.
      
      Apart from the new name, these changes have no impact on current
      users of the function, but make it more practical for use in some
      planned future commits.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      03b92017
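The stricter contract described above (the input buffer is always consumed, and the returned pointer is non-NULL if and only if reassembly is complete) can be illustrated with a small user-space model. The `toy_buf` type, the `is_last` marker and `toy_frag_rcv()` are hypothetical; real fragments carry protocol headers that this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

struct toy_buf {
    int is_last;            /* set on the final fragment of a message */
    struct toy_buf *next;
};

/*
 * Append *buf to the fragment chain at *head. The input buffer is always
 * consumed. On return, *buf points to the complete chain if reassembly
 * finished, and is NULL otherwise.
 * Returns 1 when the message is complete, 0 when more fragments are needed.
 */
static int toy_frag_rcv(struct toy_buf **head, struct toy_buf **buf)
{
    struct toy_buf *frag = *buf;
    struct toy_buf **tail = head;

    assert(frag != NULL);
    *buf = NULL;            /* caller must not touch the fragment again */
    frag->next = NULL;
    while (*tail)
        tail = &(*tail)->next;
    *tail = frag;

    if (!frag->is_last)
        return 0;
    *buf = *head;           /* hand the whole chain back */
    *head = NULL;
    return 1;
}
```

With this contract a caller can simply test `buf` after the call instead of interpreting a return code and a possibly stale pointer separately.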
    • A
      tipc: explicitly include core.h in addr.h · b3f0f5c3
      Andreas Bofjäll committed
      The inline functions in addr.h use tipc_own_addr which is exported by
      core.h, but addr.h never actually includes it. It works because it is
      explicitly included where this is used, but it looks a bit strange.
      
      Include core.h in addr.h explicitly to make the dependency clearer.
      Signed-off-by: Andreas Bofjäll <andreas.bofjall@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b3f0f5c3
  2. 19 Jan, 2014 1 commit
  3. 17 Jan, 2014 5 commits
    • Y
      tipc: standardize recvmsg routine · 9bbb4ecc
      Ying Xue committed
      Standardize the behaviour of waiting for events in TIPC recvmsg()
      so that all variables of socket or port structures are protected
      within socket lock, allowing the process of calling recvmsg() to
      be woken up at appropriate time.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9bbb4ecc
    • Y
      tipc: standardize sendmsg routine of connected socket · 391a6dd1
      Ying Xue committed
      Standardize the behaviour of waiting for events in TIPC send_packet()
      so that all variables of socket or port structures are protected within
      socket lock, allowing the process of calling sendmsg() to be woken up
      at appropriate time.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      391a6dd1
    • Y
      tipc: standardize sendmsg routine of connectionless socket · 3f40504f
      Ying Xue committed
      Comparing the behaviour of how to wait for events in TIPC sendmsg()
      with other stacks, the TIPC implementation might be perceived as
      different, and sometimes even incorrect. For instance, sk_sleep()
      and tport->congested variables associated with socket are exposed
      without socket lock protection while wait_event_interruptible_timeout()
      accesses them. Standardizing it on the implementation used
      in other stacks helps us correct these errors, in which the process
      calling sendmsg() either cannot be woken up even if an expected event
      arrives at the socket, or is improperly woken up although the wake
      condition doesn't match.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3f40504f
    • Y
      tipc: standardize accept routine · 6398e23c
      Ying Xue committed
      Comparing the behaviour of how to wait for events in TIPC accept()
      with other stacks, the TIPC implementation might be perceived as
      different, and sometimes even incorrect. As sk_sleep() and
      sk->sk_receive_queue variables associated with socket are not
      protected by socket lock, the process of calling accept() may be
      woken up improperly or sometimes cannot be woken up at all. After
      standardizing it with inet_csk_wait_for_connect routine, we can
      get benefits including: avoiding 'thundering herd' phenomenon,
      adding a timeout mechanism for accept(), coping with a pending
      signal, and having sk_sleep() and sk->sk_receive_queue being
      always protected within socket lock scope and so on.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6398e23c
    • Y
      tipc: standardize connect routine · 78eb3a53
      Ying Xue committed
      Comparing the behaviour of how to wait for events in TIPC connect()
      with other stacks, the TIPC implementation might be perceived as
      different, and sometimes even incorrect. For instance, both
      sock->state and sk_sleep() are directly fed to
      wait_event_interruptible_timeout() as its arguments, and the socket lock
      has to be released before we call wait_event_interruptible_timeout().
      The two variables associated with the socket are thus exposed outside
      socket lock protection and may hold stale values, so that the
      process calling connect() may not be woken up even if the
      correct event arrives, or may be woken up improperly even if the wake
      condition is not satisfied. Standardizing its behaviour on the
      sk_stream_wait_connect routine avoids these risks.
      
      Additionally the implementation of connect routine is simplified as a
      whole, allowing it to return correct values in all different cases.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      78eb3a53
  4. 15 Jan, 2014 1 commit
  5. 08 Jan, 2014 5 commits
    • J
      tipc: make link start event synchronous · 581465fa
      Jon Paul Maloy committed
      When a link is created we delay the start event by launching it
      to be executed later in a tasklet. As we hold all the
      necessary locks at the moment of creation, and there is no risk
      of deadlock or contention, this delay serves no purpose in the
      current code.
      
      We remove this obsolete indirection step, and the associated function
      link_start(). At the same time, we rename the function tipc_link_stop()
      to the more appropriate tipc_link_purge_queues().
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      581465fa
    • Y
      tipc: introduce new spinlock to protect struct link_req · f9a2c80b
      Ying Xue committed
      Currently, only 'bearer_lock' is used to protect struct link_req in
      the function disc_timeout(). This is unsafe, since the member fields
      'num_nodes' and 'timer_intv' might be accessed by the three different
      threads below simultaneously, none of them grabbing bearer_lock in the
      critical region:
      
      link_activate()
        tipc_bearer_add_dest()
          tipc_disc_add_dest()
            req->num_nodes++;
      
      tipc_link_reset()
        tipc_bearer_remove_dest()
          tipc_disc_remove_dest()
            req->num_nodes--
            disc_update()
              read req->num_nodes
              write req->timer_intv
      
      disc_timeout()
        read req->num_nodes
        read/write req->timer_intv
      
      Without lock protection, the only symptom of a race is that discovery
      messages occasionally may not be sent out. This is not fatal, since such
      messages are best-effort anyway. On the other hand, since discovery
      messages are not time critical, adding a protecting lock brings no
      serious overhead either. So we add a new, dedicated spinlock in
      order to guarantee absolute data consistency in link_req objects.
      This also helps reduce the overall role of the bearer_lock, which
      we want to remove completely in a later commit series.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f9a2c80b
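A dedicated lock around the two shared fields named above might be modelled along these lines. This is a user-space sketch only: the lock is a plain flag standing in for a spinlock, the interval constants and the back-off rule in `disc_timeout()` are invented for illustration, and the function bodies are not the kernel's.

```c
#include <assert.h>

#define REQ_INIT 125u    /* hypothetical initial interval, in ms */
#define REQ_SLOW 60000u  /* hypothetical upper bound, in ms */

struct toy_link_req {
    int locked;               /* stand-in for the new dedicated spinlock */
    int num_nodes;
    unsigned int timer_intv;
};

static void req_lock(struct toy_link_req *r)   { assert(!r->locked); r->locked = 1; }
static void req_unlock(struct toy_link_req *r) { assert(r->locked);  r->locked = 0; }

/* Path 1: a link became active, so one more node is reachable. */
static void disc_add_dest(struct toy_link_req *req)
{
    req_lock(req);
    req->num_nodes++;
    req_unlock(req);
}

/* Path 2: a link was reset; restart fast discovery if no node is left. */
static void disc_remove_dest(struct toy_link_req *req)
{
    req_lock(req);
    assert(req->num_nodes > 0);
    if (--req->num_nodes == 0)
        req->timer_intv = REQ_INIT;
    req_unlock(req);
}

/* Path 3: timer expiry; back off the interval and return the new value. */
static unsigned int disc_timeout(struct toy_link_req *req)
{
    unsigned int intv;

    req_lock(req);
    if (req->timer_intv < REQ_SLOW) {
        req->timer_intv *= 2;
        if (req->timer_intv > REQ_SLOW)
            req->timer_intv = REQ_SLOW;
    }
    intv = req->timer_intv;
    req_unlock(req);
    return intv;
}
```

All three paths touch `num_nodes` or `timer_intv` only between `req_lock()` and `req_unlock()`, which is the property the commit message is after.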
    • J
      tipc: remove 'has_redundant_link' flag from STATE link protocol messages · b9d4c339
      Jon Paul Maloy committed
      The flag 'has_redundant_link' is defined only in RESET and ACTIVATE
      protocol messages. Due to an ambiguity in the protocol specification it
      is currently also transferred in STATE messages. Its value is used to
      initialize a link state variable, 'permit_changeover', which is used
      to inhibit futile link failover attempts when it is known that the
      peer node has no working links at the moment, although the local node
      may still think it has one.
      
      The fact that 'has_redundant_link' incorrectly is read from STATE
      messages has the effect that 'permit_changeover' sometimes gets a wrong
      value, and permanently blocks any links from being re-established. Such
      failures can only occur in dual-link systems, and are extremely rare.
      This bug seems to have always been present in the code.
      
      Furthermore, since commit b4b56102
      ("tipc: Ensure both nodes recognize loss of contact between them"),
      the 'permit_changeover' field serves no purpose any more. The task of
      enforcing 'lost contact' cycles at both peer endpoints is now taken
      by a new mechanism, using the flags WAIT_NODE_DOWN and WAIT_PEER_DOWN
      in struct tipc_node to abort unnecessary failover attempts.
      
      We therefore remove the 'has_redundant_link' flag from STATE messages,
      as well as the now redundant 'permit_changeover' variable.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b9d4c339
    • J
      tipc: rename functions related to link failover and improve comments · 170b3927
      Jon Paul Maloy committed
      The functionality related to link addition and failover is unnecessarily
      hard to understand and maintain. We try to improve this by renaming
      some of the functions, at the same time adding or improving the
      explanatory comments around them. Names such as "tipc_rcv()" etc. also
      align better with what is used in other networking components.
      
      The changes in this commit are purely cosmetic, no functional changes
      are made.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      170b3927
    • E
      tipc: correctly unlink packets from deferred packet queue · 732256b9
      Erik Hugne committed
      When we pull a received packet from a link's 'deferred packets' queue
      for processing, its 'next' pointer is not cleared, and still refers to
      the next packet in that queue, if any. This is incorrect, but caused
      no harm before commit 40ba3cdf ("tipc:
      message reassembly using fragment chain") was introduced. After that
      commit, it may sometimes lead to the following oops:
      
      general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      Modules linked in: tipc
      CPU: 4 PID: 0 Comm: swapper/4 Tainted: G        W 3.13.0-rc2+ #6
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      task: ffff880017af4880 ti: ffff880017aee000 task.ti: ffff880017aee000
      RIP: 0010:[<ffffffff81710694>]  [<ffffffff81710694>] skb_try_coalesce+0x44/0x3d0
      RSP: 0018:ffff880016603a78  EFLAGS: 00010212
      RAX: 6b6b6b6bd6d6d6d6 RBX: ffff880013106ac0 RCX: ffff880016603ad0
      RDX: ffff880016603ad7 RSI: ffff88001223ed00 RDI: ffff880013106ac0
      RBP: ffff880016603ab8 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000000 R12: ffff88001223ed00
      R13: ffff880016603ad0 R14: 000000000000058c R15: ffff880012297650
      FS:  0000000000000000(0000) GS:ffff880016600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 000000000805b000 CR3: 0000000011f5d000 CR4: 00000000000006e0
      Stack:
       ffff880016603a88 ffffffff810a38ed ffff880016603aa8 ffff88001223ed00
       0000000000000001 ffff880012297648 ffff880016603b68 ffff880012297650
       ffff880016603b08 ffffffffa0006c51 ffff880016603b08 00ffffffa00005fc
      Call Trace:
       <IRQ>
       [<ffffffff810a38ed>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffffa0006c51>] tipc_link_recv_fragment+0xd1/0x1b0 [tipc]
       [<ffffffffa0007214>] tipc_recv_msg+0x4e4/0x920 [tipc]
       [<ffffffffa00016f0>] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
       [<ffffffffa000177c>] tipc_l2_rcv_msg+0xcc/0x250 [tipc]
       [<ffffffffa00016f0>] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
       [<ffffffff8171e65b>] __netif_receive_skb_core+0x80b/0xd00
       [<ffffffff8171df94>] ? __netif_receive_skb_core+0x144/0xd00
       [<ffffffff8171eb76>] __netif_receive_skb+0x26/0x70
       [<ffffffff8171ed6d>] netif_receive_skb+0x2d/0x200
       [<ffffffff8171fe70>] napi_gro_receive+0xb0/0x130
       [<ffffffff815647c2>] e1000_clean_rx_irq+0x2c2/0x530
       [<ffffffff81565986>] e1000_clean+0x266/0x9c0
       [<ffffffff81985f7b>] ? notifier_call_chain+0x2b/0x160
       [<ffffffff8171f971>] net_rx_action+0x141/0x310
       [<ffffffff81051c1b>] __do_softirq+0xeb/0x480
       [<ffffffff819817bb>] ? _raw_spin_unlock+0x2b/0x40
       [<ffffffff810b8c42>] ? handle_fasteoi_irq+0x72/0x100
       [<ffffffff81052346>] irq_exit+0x96/0xc0
       [<ffffffff8198cbc3>] do_IRQ+0x63/0xe0
       [<ffffffff81981def>] common_interrupt+0x6f/0x6f
       <EOI>
      
      This happens when the last fragment of a message has passed through the
      receiving link's 'deferred packets' queue, and at least one other
      packet was added to that queue while it was there. After the fragment
      chain with the complete message has been successfully delivered to the
      receiving socket, it is released. Since the 'next' pointer of the last
      fragment in the released chain now is non-NULL, we get the crash shown
      above.
      
      We fix this by clearing the 'next' pointer of all received packets,
      including those being pulled from the 'deferred' queue, before they
      undergo any further processing.
      
      Fixes: 40ba3cdf ("tipc: message reassembly using fragment chain")
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reported-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      732256b9
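The fix described above comes down to fully unlinking a buffer when it leaves the deferred queue. A minimal user-space sketch of that step, with a hypothetical `toy_buf` type and `deferred_dequeue()` helper in place of the kernel's sk_buff handling:

```c
#include <assert.h>
#include <stddef.h>

struct toy_buf {
    unsigned int seqno;
    struct toy_buf *next;
};

/*
 * Pull the first buffer off the deferred queue. The 'next' pointer is
 * cleared before the buffer is handed on, so that later code which
 * treats 'next' as a fragment chain never walks into the old queue.
 */
static struct toy_buf *deferred_dequeue(struct toy_buf **queue_head)
{
    struct toy_buf *buf;

    assert(queue_head != NULL);
    buf = *queue_head;
    if (!buf)
        return NULL;
    *queue_head = buf->next;
    buf->next = NULL;   /* unlink completely */
    return buf;
}
```

Without the `buf->next = NULL` line, a dequeued last fragment would still point at whatever packet followed it in the deferred queue, which is the stale reference behind the oops quoted in the commit message.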
  6. 05 Jan, 2014 2 commits
  7. 02 Jan, 2014 1 commit
  8. 30 Dec, 2013 1 commit
    • Y
      tipc: fix deadlock during socket release · 84602761
      Ying Xue committed
      A deadlock might occur if the name table is withdrawn in the socket release
      routine while packets are still being received from the bearer.
      
             CPU0                       CPU1
      T0:   recv_msg()               release()
      T1:   tipc_recv_msg()          tipc_withdraw()
      T2:   [grab node lock]         [grab port lock]
      T3:   tipc_link_wakeup_ports() tipc_nametbl_withdraw()
      T4:   [grab port lock]*        named_cluster_distribute()
      T5:   wakeupdispatch()         tipc_link_send()
      T6:                            [grab node lock]*
      
      Taking the port lock and the node lock in opposite order on the two
      paths above may result in a deadlock. If the socket lock instead of
      the port lock is used to protect the port instance in tipc_withdraw(), the
      reverse order of holding the port lock and the node lock is eliminated,
      and with it the deadlock.
      Reported-by: Lars Everbrand <lars.everbrand@ericsson.com>
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      84602761
  9. 17 Dec, 2013 4 commits
  10. 11 Dec, 2013 5 commits