1. 30 8月, 2014 1 次提交
  2. 24 8月, 2014 13 次提交
  3. 17 8月, 2014 1 次提交
  4. 30 7月, 2014 1 次提交
  5. 21 7月, 2014 1 次提交
  6. 17 7月, 2014 3 次提交
  7. 16 7月, 2014 1 次提交
  8. 09 7月, 2014 1 次提交
  9. 28 6月, 2014 8 次提交
    • J
      tipc: simplify connection congestion handling · 60120526
      Jon Paul Maloy 提交于
      As a consequence of the recently introduced serialized access
      to the socket in commit 8d94168a761819d10252bab1f8de6d7b202c3baa
      ("tipc: same receive code path for connection protocol and data
      messages") we can make a number of simplifications in the
      detection and handling of connection congestion situations.
      
      - We don't need to keep two counters, one for sent messages and one
        for acked messages. There is no longer any risk for races between
        acknowledge messages arriving in BH and data message sending
        running in user context. So we merge this into one counter,
        'sent_unacked', which is incremented at sending and subtracted
        from at acknowledge reception.
      
      - We don't need to set the 'congested' field in tipc_port to
        true before we sent the message, and clear it when sending
        is successful. (As a matter of fact, it was never necessary;
        the field was set in link_schedule_port() before any wakeup
        could arrive anyway.)
      
      - We keep the conditions for link congestion and connection connection
        congestion separated. There would otherwise be a risk that an arriving
        acknowledge message may wake up a user sleeping because of link
        congestion.
      
      - We can simplify reception of acknowledge messages.
      
      We also make some cosmetic/structural changes:
      
      - We rename the 'congested' field to the more correct 'link_cong´.
      
      - We rename 'conn_unacked' to 'rcv_unacked'
      
      - We move the above mentioned fields from struct tipc_port to
        struct tipc_sock.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      60120526
    • J
      tipc: clean up connection protocol reception function · ac0074ee
      Jon Paul Maloy 提交于
      We simplify the code for receiving connection probes, leveraging the
      recently introduced tipc_msg_reverse() function. We also stick to
      the principle of sending a possible response message directly from
      the calling (tipc_sk_rcv or backlog_rcv) functions, hence making
      the call chain shallower and easier to follow.
      
      We make one small protocol change here, allowed according to
      the spec. If a protocol message arrives from a remote socket that
      is not the one we are connected to, we are currently generating a
      connection abort message and send it to the source. This behavior
      is unnecessary, and might even be a security risk, so instead we
      now choose to only ignore the message. The consequnce for the sender
      is that he will need longer time to discover his mistake (until the
      next timeout), but this is an extreme corner case, and may happen
      anyway under other circumstances, so we deem this change acceptable.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac0074ee
    • J
      tipc: same receive code path for connection protocol and data messages · ec8a2e56
      Jon Paul Maloy 提交于
      As a preparation to eliminate port_lock we need to bring reception
      of connection protocol messages under proper protection of bh_lock_sock
      or socket owner.
      
      We fix this by letting those messages follow the same code path as
      incoming data messages.
      
      As a side effect of this change, the last reference to the function
      net_route_msg() disappears, and we can eliminate that function.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec8a2e56
    • J
      tipc: connection oriented transport uses new send functions · 4ccfe5e0
      Jon Paul Maloy 提交于
      We move the message sending across established connections
      to use the message preparation and send functions introduced
      earlier in this series. We now do the message preparation
      and call to the link send function directly from the socket,
      instead of going via the port layer.
      
      As a consequence of this change, the functions tipc_send(),
      tipc_port_iovec_rcv(), tipc_port_iovec_reject() and tipc_reject_msg()
      become unreferenced and can be eliminated from port.c. For the same
      reason, the functions tipc_link_xmit_fast(), tipc_link_iovec_xmit_long()
      and tipc_link_iovec_fast() can be eliminated from link.c.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4ccfe5e0
    • J
      tipc: RDM/DGRAM transport uses new fragmenting and sending functions · e2dafe87
      Jon Paul Maloy 提交于
      We merge the code for sending port name and port identity addressed
      messages into the corresponding send functions in socket.c, and start
      using the new fragmenting and transmit functions we just have introduced.
      
      This saves a call level and quite a few code lines, as well as making
      this part of the code easier to follow. As a consequence, the functions
      tipc_send2name() and tipc_send2port() in port.c can be removed.
      
      For practical reasons, we break out the code for sending multicast messages
      from tipc_sendmsg() and move it into a separate function, tipc_sendmcast(),
      but we do not yet convert it into using the new build/send functions.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e2dafe87
    • J
      tipc: introduce message evaluation function · 5a379074
      Jon Paul Maloy 提交于
      When a message arrives in a node and finds no destination
      socket, we may need to drop it, reject it, or forward it after
      a secondary destination lookup. The latter two cases currently
      results in a code path that is perceived as complex, because it
      follows a deep call chain via obscure functions such as
      net_route_named_msg() and net_route_msg().
      
      We now introduce a function, tipc_msg_eval(), that takes the
      decision about whether such a message should be rejected or
      forwarded, but leaves it to the caller to actually perform
      the indicated action.
      
      If the decision is 'reject', it is still the task of the recently
      introduced function tipc_msg_reverse() to take the final decision
      about whether the message is rejectable or not. In the latter case
      it drops the message.
      
      As a result of this change, we can finally eliminate the function
      net_route_named_msg(), and hence become independent of net_route_msg().
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a379074
    • J
      tipc: separate building and sending of rejected messages · 8db1bae3
      Jon Paul Maloy 提交于
      The way we build and send rejected message is currenty perceived as
      hard to follow, partly because we let the transmission go via deep
      call chains through functions such as tipc_reject_msg() and
      net_route_msg().
      
      We want to remove those functions, and make the call sequences shallower
      and simpler. For this purpose, we separate building and sending of
      rejected messages. We build the reject message using the new function
      tipc_msg_reverse(), and let the transmission go via the newly introduced
      tipc_link_xmit2() function, as all transmission eventually will do. We
      also ensure that all calls to tipc_link_xmit2() are made outside
      port_lock/bh_lock_sock.
      
      Finally, we replace all calls to tipc_reject_msg() with the two new
      calls at all locations in the code that we want to keep. The remaining
      calls are made from code that we are planning to remove, along with
      tipc_reject_msg() itself.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8db1bae3
    • J
      tipc: use negative error return values in functions · e4de5fab
      Jon Paul Maloy 提交于
      In some places, TIPC functions returns positive integers as return
      codes. This goes against standard Linux coding practice, and may
      even cause problems in some cases.
      
      We now change the return values of the functions filter_rcv()
      and filter_connect() to become signed integers, and return
      negative error codes when needed. The codes we use in these
      particular cases are still TIPC specific, since they are both
      part of the TIPC API and have no correspondence in errno.h
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e4de5fab
  10. 12 6月, 2014 1 次提交
  11. 25 5月, 2014 1 次提交
  12. 15 5月, 2014 3 次提交
    • J
      tipc: merge port message reception into socket reception function · 9816f061
      Jon Paul Maloy 提交于
      In order to reduce complexity and save a call level during message
      reception at port/socket level, we remove the function tipc_port_rcv()
      and merge its functionality into tipc_sk_rcv().
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9816f061
    • J
      tipc: compensate for double accounting in socket rcv buffer · 4f4482dc
      Jon Paul Maloy 提交于
      The function net/core/sock.c::__release_sock() runs a tight loop
      to move buffers from the socket backlog queue to the receive queue.
      
      As a security measure, sk_backlog.len of the receiving socket
      is not set to zero until after the loop is finished, i.e., until
      the whole backlog queue has been transferred to the receive queue.
      During this transfer, the data that has already been moved is counted
      both in the backlog queue and the receive queue, hence giving an
      incorrect picture of the available queue space for new arriving buffers.
      
      This leads to unnecessary rejection of buffers by sk_add_backlog(),
      which in TIPC leads to unnecessarily broken connections.
      
      In this commit, we compensate for this double accounting by adding
      a counter that keeps track of it. The function socket.c::backlog_rcv()
      receives buffers one by one from __release_sock(), and adds them to the
      socket receive queue. If the transfer is successful, it increases a new
      atomic counter 'tipc_sock::dupl_rcvcnt' with 'truesize' of the
      transferred buffer. If a new buffer arrives during this transfer and
      finds the socket busy (owned), we attempt to add it to the backlog.
      However, when sk_add_backlog() is called, we adjust the 'limit'
      parameter with the value of the new counter, so that the risk of
      inadvertent rejection is eliminated.
      
      It should be noted that this change does not invalidate the original
      purpose of zeroing 'sk_backlog.len' after the full transfer. We set an
      upper limit for dupl_rcvcnt, so that if a 'wild' sender (i.e., one that
      doesn't respect the send window) keeps pumping in buffers to
      sk_add_backlog(), he will eventually reach an upper limit,
      (2 x TIPC_CONN_OVERLOAD_LIMIT). After that, no messages can be added
      to the backlog, and the connection will be broken. Ordinary, well-
      behaved senders will never reach this buffer limit at all.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f4482dc
    • J
      tipc: decrease connection flow control window · 6163a194
      Jon Paul Maloy 提交于
      Memory overhead when allocating big buffers for data transfer may
      be quite significant. E.g., truesize of a 64 KB buffer turns out
      to be 132 KB, 2 x the requested size.
      
      This invalidates the "worst case" calculation we have been
      using to determine the default socket receive buffer limit,
      which is based on the assumption that 1024x64KB = 67MB buffers
      may be queued up on a socket.
      
      Since TIPC connections cannot survive hitting the buffer limit,
      we have to compensate for this overhead.
      
      We do that in this commit by dividing the fix connection flow
      control window from 1024 (2*512) messages to 512 (2*256). Since
      older version nodes send out acks at 512 message intervals,
      compatibility with such nodes is guaranteed, although performance
      may be non-optimal in such cases.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6163a194
  13. 27 4月, 2014 1 次提交
  14. 12 4月, 2014 1 次提交
    • D
      net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      David S. Miller 提交于
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->s_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But at the moment we place the SKB onto the socket receive queue it
      can be consumed and freed up.  So this skb->len access is potentially
      to freed up memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no actual implementation of this callback actually uses
      the length argument.  And since nobody actually cared about it's
      value, lots of call sites pass arbitrary values in such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      676d2369
  15. 08 4月, 2014 1 次提交
  16. 13 3月, 2014 2 次提交