1. 15 3月, 2015 3 次提交
  2. 06 2月, 2015 3 次提交
    • J
      tipc: resolve race problem at unicast message reception · c637c103
      Jon Paul Maloy 提交于
      TIPC handles message cardinality and sequencing at the link layer,
      before passing messages upwards to the destination sockets. During the
      upcall from link to socket no locks are held. It is therefore possible,
      and we see it happen occasionally, that messages arriving in different
      threads and delivered in sequence still bypass each other before they
      reach the destination socket. This must not happen, since it violates
      the sequentiality guarantee.
      
      We solve this by adding a new input buffer queue to the link structure.
      Arriving messages are added safely to the tail of that queue by the
      link, while the head of the queue is consumed, also safely, by the
      receiving socket. Sequentiality is secured per socket by only allowing
      buffers to be dequeued inside the socket lock. Since there may be multiple
      simultaneous readers of the queue, we use a 'filter' parameter to reduce
      the risk that they peek the same buffer from the queue, hence also
      reducing the risk of contention on the receiving socket locks.
      
      This solves the sequentiality problem, and seems to cause no measurable
      performance degradation.
      
      A nice side effect of this change is that lock handling in the functions
      tipc_rcv() and tipc_bcast_rcv() now becomes uniform, something that
      will enable future simplifications of those functions.
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c637c103
    • J
      tipc: split up function tipc_msg_eval() · e3a77561
      Jon Paul Maloy 提交于
      The function tipc_msg_eval() is in reality doing two related, but
      different tasks. First it tries to find a new destination for named
      messages, in case there was no first lookup, or if the first lookup
      failed. Second, it does what its name suggests, evaluating the validity
      of the message and its destination, and returning an appropriate error
      code depending on the result.
      
      This is confusing, and in this commit we choose to break it up into two
      functions. A new function, tipc_msg_lookup_dest(), first attempts to find
      a new destination, if the message is of the right type. If this lookup
      fails, or if the message should not be subject to a second lookup, the
      already existing tipc_msg_reverse() is called. This function performs
      prepares the message for rejection, if applicable.
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e3a77561
    • J
      tipc: reduce usage of context info in socket and link · c5898636
      Jon Paul Maloy 提交于
      The most common usage of namespace information is when we fetch the
      own node addess from the net structure. This leads to a lot of
      passing around of a parameter of type 'struct net *' between
      functions just to make them able to obtain this address.
      
      However, in many cases this is unnecessary. The own node address
      is readily available as a member of both struct tipc_sock and
      tipc_link, and can be fetched from there instead.
      The fact that the vast majority of functions in socket.c and link.c
      anyway are maintaining a pointer to their respective base structures
      makes this option even more compelling.
      
      In this commit, we introduce the inline functions tsk_own_node()
      and link_own_node() to make it easy for functions to fetch the node
      address from those structs instead of having to pass along and
      dereference the namespace struct.
      
      In particular, we make calls to the msg_xx() functions in msg.{h,c}
      context independent by directly passing them the own node address
      as parameter when needed. Those functions should be regarded as
      leaves in the code dependency tree, and it is hence desirable to
      keep them namspace unaware.
      
      Apart from a potential positive effect on cache behavior, these
      changes make it easier to introduce the changes that will follow
      later in this series.
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5898636
  3. 04 2月, 2015 1 次提交
  4. 13 1月, 2015 4 次提交
  5. 10 12月, 2014 1 次提交
    • A
      put iov_iter into msghdr · c0371da6
      Al Viro 提交于
      Note that the code _using_ ->msg_iter at that point will be very
      unhappy with anything other than unshifted iovec-backed iov_iter.
      We still need to convert users to proper primitives.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      c0371da6
  6. 27 11月, 2014 3 次提交
  7. 24 11月, 2014 1 次提交
  8. 31 10月, 2014 1 次提交
  9. 24 8月, 2014 2 次提交
    • J
      tipc: use pseudo message to wake up sockets after link congestion · 50100a5e
      Jon Paul Maloy 提交于
      The current link implementation keeps a linked list of blocked ports/
      sockets that is populated when there is link congestion. The purpose
      of this is to let the link know which users to wake up when the
      congestion abates.
      
      This adds unnecessary complexity to the data structure and the code,
      since it forces us to involve the link each time we want to delete
      a socket. It also forces us to grab the spinlock port_lock within
      the scope of node_lock. We want to get rid of this direct dependence,
      as well as the deadlock hazard resulting from the usage of port_lock.
      
      In this commit, we instead let the link keep list of a "wakeup" pseudo
      messages for use in such situations. Those messages are sent to the
      pending sockets via the ordinary message reception path, and wake up
      the socket's owner when they are received.
      
      This enables us to get rid of the 'waiting_ports' linked lists in struct
      tipc_port that manifest this direct reference. As a consequence, we can
      eliminate another BH entry into the socket, and hence the need to grab
      port_lock. This is a further step in our effort to remove port_lock
      altogether.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      50100a5e
    • J
      tipc: introduce new function tipc_msg_create() · 1dd0bd2b
      Jon Paul Maloy 提交于
      The function tipc_msg_init() has turned out to be of limited value
      in many cases. It take too few parameters to be usable for creating
      a complete message, it makes too many assumptions about what the
      message should be used for, and it does not allocate any buffer to
      be returned to the caller.
      
      Therefore, we now introduce the new function tipc_msg_create(), which
      takes all the parameters needed to create a full message, and returns
      a buffer of the requested size. The new function will be very useful
      for the changes we will be doing in later commits in this series.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1dd0bd2b
  10. 29 7月, 2014 1 次提交
    • J
      tipc: make tipc_buf_append() more robust · 13e9b997
      Jon Paul Maloy 提交于
      As per comment from David Miller, we try to make the buffer reassembly
      function more resilient to user errors than it is today.
      
      - We check that the "*buf" parameter always is set, since this is
        mandatory input.
      
      - We ensure that *buf->next always is set to NULL before linking in
        the buffer, instead of relying of the caller to have done this.
      
      - We ensure that the "tail" pointer in the head buffer's control
        block is initialized to NULL when the first fragment arrives.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      13e9b997
  11. 17 7月, 2014 3 次提交
  12. 09 7月, 2014 1 次提交
    • J
      tipc: fix bug in multicast/broadcast message reassembly · 29322d0d
      Jon Paul Maloy 提交于
      Since commit 37e22164 ("tipc: rename and
      move message reassembly function") reassembly of long broadcast messages
      has been broken. This is because we test for a non-NULL return value
      of the *buf parameter as criteria for succesful reassembly. However, this
      parameter is left defined even after reception of the first fragment,
      when reassebly is still incomplete. This leads to a kernel crash as soon
      as a the first fragment of a long broadcast message is received.
      
      We fix this with this commit, by implementing a stricter behavior of the
      function and its return values.
      
      This commit should be applied to both net and net-next.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      29322d0d
  13. 28 6月, 2014 5 次提交
    • J
      tipc: clean up connection protocol reception function · ac0074ee
      Jon Paul Maloy 提交于
      We simplify the code for receiving connection probes, leveraging the
      recently introduced tipc_msg_reverse() function. We also stick to
      the principle of sending a possible response message directly from
      the calling (tipc_sk_rcv or backlog_rcv) functions, hence making
      the call chain shallower and easier to follow.
      
      We make one small protocol change here, allowed according to
      the spec. If a protocol message arrives from a remote socket that
      is not the one we are connected to, we are currently generating a
      connection abort message and send it to the source. This behavior
      is unnecessary, and might even be a security risk, so instead we
      now choose to only ignore the message. The consequnce for the sender
      is that he will need longer time to discover his mistake (until the
      next timeout), but this is an extreme corner case, and may happen
      anyway under other circumstances, so we deem this change acceptable.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac0074ee
    • J
      tipc: introduce message evaluation function · 5a379074
      Jon Paul Maloy 提交于
      When a message arrives in a node and finds no destination
      socket, we may need to drop it, reject it, or forward it after
      a secondary destination lookup. The latter two cases currently
      results in a code path that is perceived as complex, because it
      follows a deep call chain via obscure functions such as
      net_route_named_msg() and net_route_msg().
      
      We now introduce a function, tipc_msg_eval(), that takes the
      decision about whether such a message should be rejected or
      forwarded, but leaves it to the caller to actually perform
      the indicated action.
      
      If the decision is 'reject', it is still the task of the recently
      introduced function tipc_msg_reverse() to take the final decision
      about whether the message is rejectable or not. In the latter case
      it drops the message.
      
      As a result of this change, we can finally eliminate the function
      net_route_named_msg(), and hence become independent of net_route_msg().
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a379074
    • J
      tipc: separate building and sending of rejected messages · 8db1bae3
      Jon Paul Maloy 提交于
      The way we build and send rejected message is currenty perceived as
      hard to follow, partly because we let the transmission go via deep
      call chains through functions such as tipc_reject_msg() and
      net_route_msg().
      
      We want to remove those functions, and make the call sequences shallower
      and simpler. For this purpose, we separate building and sending of
      rejected messages. We build the reject message using the new function
      tipc_msg_reverse(), and let the transmission go via the newly introduced
      tipc_link_xmit2() function, as all transmission eventually will do. We
      also ensure that all calls to tipc_link_xmit2() are made outside
      port_lock/bh_lock_sock.
      
      Finally, we replace all calls to tipc_reject_msg() with the two new
      calls at all locations in the code that we want to keep. The remaining
      calls are made from code that we are planning to remove, along with
      tipc_reject_msg() itself.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8db1bae3
    • J
      tipc: introduce direct iovec to buffer chain fragmentation function · 067608e9
      Jon Paul Maloy 提交于
      Fragmentation at message sending is currently performed in two
      places in link.c, depending on whether data to be transmitted
      is delivered in the form of an iovec or as a big sk_buff. Those
      functions are also tightly entangled with the send functions
      that are using them.
      
      We now introduce a re-entrant, standalone function, tipc_msg_build2(),
      that builds a packet chain directly from an iovec. Each fragment is
      sized according to the MTU value given by the caller, and is prepended
      with a correctly built fragment header, when needed. The function is
      independent from who is calling and where the chain will be delivered,
      as long as the caller is able to indicate a correct MTU.
      
      The function is tested, but not called by anybody yet. Since it is
      incompatible with the existing tipc_msg_build(), and we cannot yet
      remove that function, we have given it a temporary name.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      067608e9
    • J
      tipc: introduce send functions for chained buffers in link · 4f1688b2
      Jon Paul Maloy 提交于
      The current link implementation provides several different transmit
      functions, depending on the characteristics of the message to be
      sent: if it is an iovec or an sk_buff, if it needs fragmentation or
      not, if the caller holds the node_lock or not. The permutation of
      these options gives us an unwanted amount of unnecessarily complex
      code.
      
      As a first step towards simplifying the send path for all messages,
      we introduce two new send functions at link level, tipc_link_xmit2()
      and __tipc_link_xmit2(). The former looks up a link to the message
      destination, and if one is found, it grabs the node lock and calls
      the second function, which works exclusively inside the node lock
      protection. If no link is found, and the destination is on the same
      node, it delivers the message directly to the local destination
      socket.
      
      The new functions take a buffer chain where all packet headers are
      already prepared, and the correct MTU has been used. These two
      functions will later replace all other link-level transmit functions.
      
      The functions are not backwards compatible, so we have added them
      as new functions with temporary names. They are tested, but have no
      users yet. Those will be added later in this series.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f1688b2
  14. 15 5月, 2014 1 次提交
  15. 19 10月, 2013 2 次提交
  16. 18 6月, 2013 2 次提交
  17. 14 7月, 2012 1 次提交
  18. 01 5月, 2012 1 次提交
    • P
      tipc: compress out gratuitous extra carriage returns · 617d3c7a
      Paul Gortmaker 提交于
      Some of the comment blocks are floating in limbo between two
      functions, or between blocks of code.  Delete the extra line
      feeds between any comment and its associated following block
      of code, to be consistent with the majority of the rest of
      the kernel.  Also delete trailing newlines at EOF and fix
      a couple trivial typos in existing comments.
      
      This is a 100% cosmetic change with no runtime impact.  We get
      rid of over 500 lines of non-code, and being blank line deletes,
      they won't even show up as noise in git blame.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      617d3c7a
  19. 25 2月, 2012 1 次提交
  20. 28 12月, 2011 1 次提交
  21. 25 6月, 2011 1 次提交
  22. 11 5月, 2011 1 次提交
    • A
      tipc: Avoid recomputation of outgoing message length · 26896904
      Allan Stephens 提交于
      Rework TIPC's message sending routines to take advantage of the total
      amount of data value passed to it by the kernel socket infrastructure.
      This change eliminates the need for TIPC to compute the size of outgoing
      messages itself, as well as the check for an oversize message in
      tipc_msg_build().  In addition, this change warrants an explanation:
      
         -     res = send_packet(NULL, sock, &my_msg, 0);
         +     res = send_packet(NULL, sock, &my_msg, bytes_to_send);
      
      Previously, the final argument to send_packet() was ignored (since the
      amount of data being sent was recalculated by a lower-level routine)
      and we could just pass in a dummy value (0). Now that the
      recalculation is being eliminated, the argument value being passed to
      send_packet() is significant and we have to supply the actual amount
      of data we want to send.
      Signed-off-by: NAllan Stephens <Allan.Stephens@windriver.com>
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      26896904