1. 16 Nov, 2017 1 commit
    • tipc: enforce valid ratio between skb truesize and contents · d618d09a
      Jon Maloy committed
      The socket level flow control is based on the assumption that incoming
      buffers meet the condition (skb->truesize / roundup(skb->len) <= 4),
      where the latter value is rounded off upwards to the nearest 1k number.
      This does empirically hold true for the device drivers we know, but we
      cannot trust that it will always be so, e.g., in a system with jumbo
      frames and very small packets.
      
      We now introduce a check for this condition at packet arrival, and if
      we find it to be false, we copy the packet to a new, smaller buffer,
      where the condition will be true. We expect this to affect only a small
      fraction of all incoming packets, if at all.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d618d09a
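The condition described in this commit can be sketched in plain C. This is only an illustration of the arithmetic, not the kernel code: the names `round_up_1k()` and `ratio_ok()` are hypothetical, and the buffer is represented by two plain sizes instead of a `struct sk_buff`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLOCK_SIZE 1024u	/* rounding unit: the "nearest 1k" in the text */
#define MAX_RATIO  4u		/* upper bound on truesize / rounded length */

/* Round len up to the next multiple of BLOCK_SIZE. */
static size_t round_up_1k(size_t len)
{
	return (len + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
}

/*
 * Return true when the buffer satisfies truesize / roundup(len) <= 4.
 * Integer division is used, mirroring the expression in the commit text.
 */
static bool ratio_ok(size_t truesize, size_t len)
{
	assert(len > 0);	/* a zero-length packet would divide by zero */
	return truesize / round_up_1k(len) <= MAX_RATIO;
}
```

A receiver following the commit's approach would copy the packet into a smaller buffer whenever `ratio_ok()` returns false, for example for a 64-byte payload sitting in a jumbo-frame sized buffer.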
  2. 11 Nov, 2017 1 commit
    • tipc: improve link resiliency when rps is activated · 8d6e79d3
      Jon Maloy committed
      Currently, the TIPC RPS dissector is based only on the incoming packets'
      source node address, hence steering all traffic from a node to the same
      core. We have seen that this makes the links vulnerable to starvation
      and unnecessary resets when we turn down the link tolerance to very low
      values.
      
      To reduce the risk of this happening, we exempt probe and probe reply
      packets from the convergence to one core per source node. Instead, we do
      the opposite: we try to spread those packets across as many cores as
      possible, by randomizing the flow selector key.
      
      To make such packets identifiable to the dissector, we add a new
      'is_keepalive' bit to word 0 of the LINK_PROTOCOL header. This bit is
      set both for PROBE and PROBE_REPLY messages, and only for those.
      
      It should be noted that these packets are not part of any flow anyway,
      and only constitute a minuscule fraction of all packets sent across a
      link. Hence, there is no risk that this will affect overall performance.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d6e79d3
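The key selection described in this commit might look roughly as follows. This is a minimal sketch under assumptions: `struct hdr_view` and `flow_key()` are hypothetical names, the header is reduced to the two fields that matter here, and `rand()` stands in for whatever random source the real dissector uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified view of the fields the dissector looks at. */
struct hdr_view {
	uint32_t src_node;	/* source node address */
	bool is_keepalive;	/* set for PROBE and PROBE_REPLY only */
};

/* Pick the flow selector key used to steer the packet to a core. */
static uint32_t flow_key(const struct hdr_view *h)
{
	assert(h != NULL);
	if (h->is_keepalive)
		return (uint32_t)rand();	/* spread probes across cores */
	return h->src_node;	/* all other traffic from a node shares a key */
}
```

Ordinary traffic keeps its per-node key, so ordering within a link is unaffected; only the keepalive packets, which belong to no flow, get a varying key.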
  3. 03 Nov, 2017 1 commit
    • tipc: eliminate unnecessary probing · fa368826
      Jon Maloy committed
      The neighbor monitor employs a threshold, default set to 32 peer nodes,
      where it activates the "Overlapping Neighbor Monitoring" algorithm.
      Below that threshold, monitoring is full-mesh, and no "domain records"
      are passed between the nodes.
      
      Because of this, a node never receives a peer's ack confirming that the
      peer has received the most recent update of the node's own domain. Hence,
      the field 'acked_gen' in struct tipc_monitor_state remains permanently at
      zero, whereas the own domain generation is incremented for each added or
      removed peer.
      
      This has the effect that the function tipc_mon_get_state() always sets
      the field 'probing' in struct tipc_monitor_state to true, which in turn
      causes tipc_link_timeout() of the link in question to always send out a
      probe, even when link->silent_intv_count is zero.
      
      This is functionally harmless, but leads to some unnecessary probing,
      which can easily be eliminated by setting the 'probing' field of the
      said struct correctly in such cases.
      
      At the same time, we explicitly invalidate the sent domain records when
      the algorithm is not activated. This will eliminate any risk that an
      invalid domain record might be inadvertently accepted by the peer.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa368826
  4. 02 Nov, 2017 1 commit
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman committed
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information.
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier should be applied
      to a file was done in a spreadsheet of side by side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few thousand files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      should be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note"; otherwise it was "GPL-2.0".  The results were:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  5. 01 Nov, 2017 1 commit
  6. 26 Oct, 2017 2 commits
  7. 22 Oct, 2017 1 commit
  8. 21 Oct, 2017 1 commit
  9. 20 Oct, 2017 1 commit
  10. 17 Oct, 2017 1 commit
  11. 13 Oct, 2017 18 commits
    • tipc: add multipoint-to-point flow control · 04d7b574
      Jon Maloy committed
      We already have point-to-multipoint flow control within a group. But
      we also need the opposite: a scheme which can handle the case where
      potentially hundreds of sources try to send messages to the same
      destination simultaneously, without causing buffer overflow at the
      recipient. This commit adds such a mechanism.
      
      The algorithm works as follows:
      
      - When a member detects a new, joining member, it initially sets its
        state to JOINED and advertises a minimum window to the new member.
        This window is chosen so that the new member can send exactly one
        maximum sized message, or several smaller ones, to the recipient
        before it must stop and wait for an additional advertisement. This
        minimum window ADV_IDLE is set to 65 1kB blocks.
      
      - When a member receives the first data message from a JOINED member,
        it changes the state of the latter to ACTIVE, and advertises a larger
        window ADV_ACTIVE = 12 x ADV_IDLE blocks to the sender, so it can
        continue sending with minimal disturbances to the data flow.
      
      - The active members are kept in a dedicated linked list. Each time a
        message is received from an active member, it will be moved to the
        tail of that list. This way, we keep a record of which members have
        been most (tail) and least (head) recently active.
      
      - There is a maximum number (16) of permitted simultaneous active
        senders per receiver. When this limit is reached, the receiver will
        not advertise anything immediately to a new sender, but instead put
        it in a PENDING state, and add it to a corresponding queue. At the
        same time, it will pick the least recently active member, send it an
        advertisement RECLAIM message, and set this member to state
        RECLAIMING.
      
      - The reclaimee member has to respond with a REMIT message, meaning that
        it goes back to a send window of ADV_IDLE, and returns its unused
        advertised blocks beyond that value to the reclaiming member.
      
      - When the reclaiming member receives the REMIT message, it unlinks
        the reclaimee from its active list, resets its state to JOINED, and
        notes that it is now back at ADV_IDLE advertised blocks to that
        member. If there are still unread data messages sent out by
        reclaimee before the REMIT, the member goes into an intermediate
        state REMITTED, where it stays until the said messages have been
        consumed.
      
      - The returned advertised blocks can now be re-advertised to the
        pending member, which is now set to state ACTIVE and added to
        the active member list.
      
      - To be proactive, i.e., to minimize the risk that any member will
        end up in the pending queue, we start reclaiming resources already
        when the number of active members exceeds 3/4 of the permitted
        maximum.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04d7b574
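The receiver-side state handling listed in this commit can be sketched as a small state machine. The constants (65 blocks, factor 12, 16 active senders, 3/4 threshold) come from the commit text; all type and function names below are hypothetical, and the active list is reduced to a counter, so the least-recently-active selection is left to the caller.

```c
#include <assert.h>

enum { ADV_IDLE = 65, ADV_ACTIVE = 12 * ADV_IDLE, MAX_ACTIVE = 16 };

enum member_state { M_JOINED, M_ACTIVE, M_PENDING, M_RECLAIMING, M_REMITTED };

struct member { enum member_state state; int window; };	/* window in 1kB blocks */
struct receiver { int active_cnt; };	/* number of currently active senders */

static void activate(struct receiver *r, struct member *m)
{
	m->state = M_ACTIVE;
	m->window = ADV_ACTIVE;
	r->active_cnt++;
}

/* First data message arrives from a JOINED member. */
static void on_first_data(struct receiver *r, struct member *m)
{
	assert(m->state == M_JOINED);
	if (r->active_cnt >= MAX_ACTIVE)
		m->state = M_PENDING;	/* no new advertisement yet */
	else
		activate(r, m);
}

/* Proactive reclaim starts once more than 3/4 of the slots are in use. */
static int should_reclaim(const struct receiver *r)
{
	return r->active_cnt > MAX_ACTIVE * 3 / 4;
}

/* A RECLAIM message is sent to the least recently active member. */
static void start_reclaim(struct member *lru)
{
	assert(lru->state == M_ACTIVE);
	lru->state = M_RECLAIMING;
}

/* REMIT received: the reclaimee drops back to ADV_IDLE, pending moves up. */
static void on_remit(struct receiver *r, struct member *reclaimee,
		     struct member *pending, int unread_msgs)
{
	assert(reclaimee->state == M_RECLAIMING);
	reclaimee->state = unread_msgs ? M_REMITTED : M_JOINED;
	reclaimee->window = ADV_IDLE;
	r->active_cnt--;
	if (pending && pending->state == M_PENDING)
		activate(r, pending);
}
```

With these numbers, ADV_ACTIVE works out to 780 blocks, and proactive reclaiming begins when a thirteenth sender becomes active.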
    • tipc: guarantee delivery of last broadcast before DOWN event · a3bada70
      Jon Maloy committed
      The following scenario is possible:
      - A user sends a broadcast message, and thereafter immediately leaves
        the group.
      - The LEAVE message, following a different path than the broadcast,
        arrives ahead of the broadcast, and the sending member is removed
        from the receiver's list.
      - The broadcast message arrives, but is dropped because the sender
        is now unknown to the recipient.
      
      We fix this by sequence numbering membership events, just like ordinary
      unicast messages. Currently, when a JOIN is sent to a peer, it contains
      a synchronization point, i.e., the sequence number of the next sent
      broadcast, in order to give the receiver a start synchronization point.
      We now let LEAVE messages also contain such an "end synchronization"
      point, so that the recipient can delay the removal of the sending member
      until it knows that all messages have been received.
      
      The received synchronization points are added as sequence numbers to the
      generated membership events, making it possible to handle them almost
      the same way as regular unicasts in the receiving filter function. In
      particular, a DOWN event with a too high sequence number will be kept
      in the reordering queue until the missing broadcast(s) arrive and have
      been delivered.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a3bada70
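The deferred removal described in this commit can be illustrated with a simplified per-sender record. This is a sketch under assumptions, not the kernel implementation: the struct and function names are hypothetical, sequence numbers are assumed to be 16-bit with wraparound, and the reordering queue is reduced to a single pending flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-sender state kept by a receiving member. */
struct sender_state {
	uint16_t bc_rcv_nxt;	/* next broadcast seqno expected from sender */
	bool down_pending;	/* LEAVE seen, DOWN event not yet delivered */
	uint16_t down_syncpt;	/* "end synchronization" point from LEAVE */
	bool removed;		/* DOWN delivered, sender forgotten */
};

/* Wraparound-safe "a comes before b" for 16-bit sequence numbers. */
static bool seq_before(uint16_t a, uint16_t b)
{
	return (int16_t)(uint16_t)(a - b) < 0;
}

/* Deliver the DOWN event only once all earlier broadcasts have arrived. */
static void try_remove(struct sender_state *s)
{
	if (s->down_pending && !seq_before(s->bc_rcv_nxt, s->down_syncpt))
		s->removed = true;
}

static void on_leave(struct sender_state *s, uint16_t syncpt)
{
	s->down_pending = true;
	s->down_syncpt = syncpt;
	try_remove(s);
}

static void on_broadcast_delivered(struct sender_state *s)
{
	assert(!s->removed);	/* a removed sender's messages are dropped */
	s->bc_rcv_nxt++;
	try_remove(s);
}
```

If a LEAVE with synchronization point 6 arrives while the receiver still expects broadcast 5, the sender stays known until that broadcast has been delivered.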
    • tipc: guarantee delivery of UP event before first broadcast · 399574d4
      Jon Maloy committed
      The following scenario is possible:
      - A user joins a group, and immediately sends out a broadcast message
        to its members.
      - The broadcast message, following a different data path than the
        initial JOIN message sent out during the joining procedure, arrives
        at a receiver before the latter.
      - The receiver drops the message, since it is not ready to accept any
        messages until the JOIN has arrived.
      
      We avoid this by treating group protocol JOIN messages like unicast
      messages.
      - We let them pass through the recipient's multicast input queue, just
        like ordinary unicasts.
      - We force the first following broadcast to be sent as replicated
        unicast and to be acknowledged by the recipient before accepting
        any more broadcast transmissions.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      399574d4
    • tipc: guarantee that group broadcast doesn't bypass group unicast · 2f487712
      Jon Maloy committed
      We need a mechanism guaranteeing that group unicasts sent out from a
      socket are not bypassed by later sent broadcasts from the same socket.
      We do this as follows:
      
      - Each time a unicast is sent, we set the broadcast method for the
        socket to "replicast" and "mandatory". This forces the first
        subsequent broadcast message to follow the same network and data path
        as the preceding unicast to a destination, hence preventing it from
        overtaking the latter.
      
      - In order to make the 'same data path' statement above true, we let
        group unicasts pass through the multicast link input queue, instead
        of as previously through the unicast link input queue.
      
      - In the first broadcast following a unicast, we set a new header flag,
        requiring all recipients to immediately acknowledge its reception.
      
      - During the period before all the expected acknowledgements are
        received, the socket refuses to accept any more broadcast attempts,
        i.e., by blocking or returning EAGAIN. This period should typically
        not be longer than a few microseconds.
      
      - When all acknowledgements have been received, the sending socket will
        open up for subsequent broadcasts, this time giving the link layer
        freedom to itself select the best transmission method.
      
      - The forced and/or abrupt transmission method changes described above
        may lead to broadcasts arriving out of order to the recipients. We
        remedy this by introducing code that checks and if necessary
        re-orders such messages at the receiving end.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f487712
    • tipc: guarantee group unicast doesn't bypass group broadcast · b87a5ea3
      Jon Maloy committed
      Group unicast messages don't follow the same path as broadcast messages,
      and there is a high risk that unicasts sent from a socket might bypass
      previously sent broadcasts from the same socket.
      
      We fix this by letting all unicast messages carry the sequence number of
      the next sent broadcast from the same node, but without updating this
      number at the receiver. This way, a receiver can check and if necessary
      re-order such messages before they are added to the socket receive buffer.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b87a5ea3
    • tipc: introduce group multicast messaging · 5b8dddb6
      Jon Maloy committed
      The previously introduced message transport to all group members is
      based on the tipc multicast service, but is logically a broadcast
      service within the group, and that is what we call it.
      
      We now add functionality for sending messages to all group members
      having a certain identity. Correspondingly, we call this feature 'group
      multicast'. The service is using unicast when only one destination is
      found, otherwise it will use the bearer broadcast service to transfer
      the messages. In the latter case, the receiving members filter arriving
      messages by looking at the intended destination instance. If there is
      no match, the message will be dropped, while still being considered
      received and read as seen by the flow control mechanism.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5b8dddb6
    • tipc: introduce group anycast messaging · ee106d7f
      Jon Maloy committed
      In this commit, we make it possible to send connectionless unicast
      messages to any member corresponding to the given member identity,
      when there is more than one such member. The sender must use a
      TIPC_ADDR_NAME address to achieve this effect.
      
      We also perform load balancing between the destinations, i.e., we
      primarily select one which has advertised sufficient send window
      to not cause a block/EAGAIN delay, if any. This mechanism is
      overlaid on the always present round-robin selection.
      
      Anycast messages are subject to the same start synchronization
      and flow control mechanism as group broadcast messages.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ee106d7f
    • tipc: introduce group unicast messaging · 27bd9ec0
      Jon Maloy committed
      We now make it possible to send connectionless unicast messages
      within a communication group. To send a message, the sender can use
      either a direct port address, aka port identity, or an indirect port
      name to be looked up.
      
      Messages of this type are subject to the same start synchronization
      and flow control mechanism as group broadcast messages.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      27bd9ec0
    • tipc: introduce flow control for group broadcast messages · b7d42635
      Jon Maloy committed
      We introduce an end-to-end flow control mechanism for group broadcast
      messages. This ensures that no messages are ever lost because of
      destination receive buffer overflow, with minimal impact on performance.
      For now, the algorithm is based on the assumption that there is only one
      active transmitter at any moment in time.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b7d42635
    • tipc: receive group membership events via member socket · ae236fb2
      Jon Maloy committed
      Like with any other service, group members' availability can be
      subscribed for by connecting to the topology server. However, because
      the events arrive via a different socket than the member socket, there
      is a real risk that membership events may arrive out of sync with the
      actual JOIN/LEAVE action. I.e., it is possible to receive the first
      messages from a new member before the corresponding JOIN event arrives,
      just as it is possible to receive the last messages from a leaving
      member after the LEAVE event has already been received.
      
      Since each member socket is internally also subscribing for membership
      events, we now fix this problem by passing those events on to the user
      via the member socket. We leverage the already present member
      synchronization protocol to guarantee correct message/event order. An
      event is delivered to the user as an empty message where the two source
      addresses identify the new/lost member. Furthermore, we set the MSG_OOB
      bit in the message flags to mark it as an event. If the event is an
      indication about a member loss we also set the MSG_EOR bit, so it can
      be distinguished from a member addition event.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ae236fb2
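From the user's side, the flag convention described in this commit could be decoded as in the sketch below. `MSG_OOB` and `MSG_EOR` are the standard socket flags named in the commit text; the enum and the `classify()` helper are hypothetical, and the flags are assumed to come from the `msg_flags` field filled in by `recvmsg()`.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical classification of what recvmsg() returned. */
enum rcv_kind { RCV_DATA, RCV_MEMBER_UP, RCV_MEMBER_DOWN };

/*
 * Interpret msg_flags as described above: MSG_OOB marks a membership
 * event, and MSG_EOR on top of that marks the loss of a member.
 */
static enum rcv_kind classify(int msg_flags)
{
	if (!(msg_flags & MSG_OOB))
		return RCV_DATA;
	return (msg_flags & MSG_EOR) ? RCV_MEMBER_DOWN : RCV_MEMBER_UP;
}
```

Because events are delivered in order with data on the same socket, an application using such a helper sees the UP event before the first message from a new member and the DOWN event after the last one.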
    • tipc: add second source address to recvmsg()/recvfrom() · 31c82a2d
      Jon Maloy committed
      With group communication, it becomes important for a message receiver to
      identify not only from which socket (identified by a node:port tuple) the
      message was sent, but also the logical identity (type:instance) of the
      sending member.
      
      We fix this by adding a second instance of struct sockaddr_tipc to the
      source address area when a message is read. The extra address struct
      is filled in with data found in the received message header (type) and
      in the local member representation struct (instance).
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      31c82a2d
    • tipc: introduce communication groups · 75da2163
      Jon Maloy committed
      As a preparation for introducing flow control for multicast and datagram
      messaging, we need a more strictly defined framework than we have now. A
      socket must be able to keep track of exactly how many and which other
      sockets it is allowed to communicate with at any moment, and keep the
      necessary state for those.
      
      We therefore introduce a new concept we have named Communication Group.
      Sockets can join a group via a new setsockopt() call TIPC_GROUP_JOIN.
      The call takes four parameters: 'type' serves as group identifier,
      'instance' serves as a logical member identifier, and 'scope' indicates
      the visibility of the group (node/cluster/zone). Finally, 'flags' makes
      it possible to set certain properties for the member. For now, there is
      only one flag, indicating if the creator of the socket wants to receive
      a copy of broadcast or multicast messages it is sending via the socket,
      and if it wants to be eligible as destination for its own anycasts.
      
      A group is closed, i.e., sockets which have not joined a group will
      not be able to send messages to or receive messages from members of
      the group, and vice versa.
      
      Any member of a group can send multicast ('group broadcast') messages
      to all group members, optionally including itself, using the primitive
      send(). The messages are received via the recvmsg() primitive. A socket
      can only be a member of one group at a time.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      75da2163
    • tipc: improve destination linked list · a80ae530
      Jon Maloy committed
      We often see a need for a linked list of destination identities,
      sometimes containing a port number, sometimes a node identity, and
      sometimes both. The currently defined struct u32_list is not generic
      enough to cover all cases, so we extend it to contain two u32 integers
      and rename it to struct tipc_dest_list.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a80ae530
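A destination list holding two u32 values per entry, as this commit describes, might be shaped like the sketch below. It is a user-space illustration only: `struct dest` and the `dest_*()` helpers are hypothetical names, a singly linked list with `malloc()` replaces the kernel's list primitives, and the field names `node` and `port` are assumptions based on the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One destination: a node identity, a port number, or both. */
struct dest {
	struct dest *next;
	uint32_t node;
	uint32_t port;
};

static struct dest *dest_find(struct dest *head, uint32_t node, uint32_t port)
{
	for (; head; head = head->next)
		if (head->node == node && head->port == port)
			return head;
	return NULL;
}

/* Add a destination; returns false on duplicate or allocation failure. */
static bool dest_push(struct dest **head, uint32_t node, uint32_t port)
{
	struct dest *d;

	assert(head != NULL);
	if (dest_find(*head, node, port))
		return false;
	d = malloc(sizeof(*d));
	if (!d)
		return false;
	d->node = node;
	d->port = port;
	d->next = *head;
	*head = d;
	return true;
}

/* Remove the most recently added destination; false if the list is empty. */
static bool dest_pop(struct dest **head, uint32_t *node, uint32_t *port)
{
	struct dest *d = *head;

	if (!d)
		return false;
	*node = d->node;
	*port = d->port;
	*head = d->next;
	free(d);
	return true;
}
```

A caller that only needs node identities can leave `port` at zero, which is how a single entry type covers all three cases mentioned in the commit.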
    • tipc: add new function for sending multiple small messages · f70d37b7
      Jon Maloy committed
      We see an increasing need to send multiple single-buffer messages
      of TIPC_SYSTEM_IMPORTANCE to different individual destination nodes.
      Instead of looping over the send queue and sending each buffer
      individually, as we do now, we add a new helper function
      tipc_node_distr_xmit() to do this.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f70d37b7
    • tipc: refactor function filter_rcv() · 64ac5f59
      Jon Maloy committed
      In the following commits we will need to handle multiple incoming and
      rejected/returned buffers in the function socket.c::filter_rcv().
      As a preparation for this, we generalize the function by handling
      buffer queues instead of individual buffers. We also introduce a
      helper function tipc_skb_reject(), and rename filter_rcv() to
      tipc_sk_filter_rcv() in line with other functions in socket.c.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      64ac5f59
    • tipc: add ability to obtain node availability status from other files · 38077b8e
      Jon Maloy committed
      In the coming commits, functions at the socket level will need the
      ability to read the availability status of a given node. We therefore
      introduce a new function for this purpose, while renaming the existing
      static function currently having the wanted name.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      38077b8e
    • tipc: improve address sanity check in tipc_connect() · 23998835
      Jon Maloy committed
      The address given to tipc_connect() is not completely sanity checked,
      under the assumption that this will be done later in the function
      __tipc_sendmsg() when the address is used there.
      
      However, the latter function will in the next commits serve as caller
      to several other send functions, so we want to move the corresponding
      sanity check there to the beginning of that function, before we possibly
      need to grab the address stored by tipc_connect(). We must therefore
      be able to trust that this address has already been thoroughly checked.
      
      We do this in this commit.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      23998835
    • J
      tipc: add ability to order and receive topology events in driver · 14c04493
      Jon Maloy committed
      As preparation for introducing communication groups, we add the ability
      to issue topology subscriptions and receive topology events from kernel
      space. This will make it possible for group member sockets to keep track
      of other group members.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      14c04493
  12. 09 Oct, 2017 2 commits
    • J
      tipc: Unclone message at secondary destination lookup · a9e2971b
      Jon Maloy committed
      When a bundling message is received, the function tipc_link_input()
      calls the function tipc_msg_extract() to unbundle all inner messages
      of the bundling message before adding them to the input queue.
      
      The function tipc_msg_extract() creates the skb for each inner
      message by cloning the bundling skb. This means that the skb
      headroom of an inner message overlaps with the data part of the
      preceding message in the bundle.
      
      If the message in question is a name addressed message, it may be
      subject to a secondary destination lookup, and eventually be sent out
      on one of the interfaces again. But, since what is perceived as headroom
      by the device driver in reality is the last bytes of the preceding
      message in the bundle, the latter will be overwritten by the MAC
      addresses of the L2 header. If the preceding message has not yet been
      consumed by the user, it will eventually be delivered with corrupted
      contents.
      
      This commit fixes this by uncloning all messages passing through the
      function tipc_msg_lookup_dest(), hence ensuring that the headroom
      is always valid when the message is passed on.
      Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a9e2971b
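The overlap described in the commit above can be illustrated with a minimal user-space sketch. It is not the TIPC code: `struct view`, `push_l2_header()`, `unclone()` and `first_msg_intact()` are hypothetical stand-ins for a cloned skb, a driver prepending an L2 header, and the copy to a private buffer. It only assumes a 14-byte Ethernet-style header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define L2_HDR_LEN 14 /* assumed Ethernet-style header length */

/* Hypothetical stand-in for a cloned skb: data points into a shared bundle. */
struct view {
    unsigned char *data;
    size_t len;
};

/* Simulates a driver writing an L2 header into what it believes is headroom. */
static void push_l2_header(struct view *v)
{
    v->data -= L2_HDR_LEN;
    v->len += L2_HDR_LEN;
    memset(v->data, 0xEE, L2_HDR_LEN);
}

/* "Unclone": copy the message into a private buffer with real headroom. */
static int unclone(struct view *v, unsigned char **priv)
{
    unsigned char *buf = malloc(L2_HDR_LEN + v->len);

    if (!buf)
        return -1;
    memcpy(buf + L2_HDR_LEN, v->data, v->len);
    v->data = buf + L2_HDR_LEN;
    *priv = buf;
    return 0;
}

/* Returns 1 if the first bundled message survives a resend of the second. */
static int first_msg_intact(int do_unclone)
{
    unsigned char bundle[64];
    unsigned char *priv = NULL;
    struct view second;
    int intact = 1;
    size_t i;

    memset(bundle, 0xAA, 32);      /* first inner message */
    memset(bundle + 32, 0xBB, 32); /* second inner message */
    second.data = bundle + 32;
    second.len = 32;

    if (do_unclone && unclone(&second, &priv) != 0)
        return -1;
    push_l2_header(&second);

    for (i = 0; i < 32; i++)
        if (bundle[i] != 0xAA)
            intact = 0;
    free(priv);
    return intact;
}
```

Calling `first_msg_intact(0)` shows the header landing on the tail of the preceding message, while `first_msg_intact(1)` leaves the bundle untouched because the header is written into the private copy.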
    • J
      tipc: correct initialization of skb list · 3382605f
      Jon Maloy committed
      We change the initialization of the skb transmit buffer queues
      in the functions tipc_bcast_xmit() and tipc_rcast_xmit() to also
      initialize their spinlocks. This is needed because we may, during
      error conditions, need to call skb_queue_purge() on those queues
      further down the stack.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3382605f
  13. 01 Oct, 2017 1 commit
  14. 07 Sep, 2017 1 commit
  15. 30 Aug, 2017 1 commit
  16. 25 Aug, 2017 4 commits
  17. 24 Aug, 2017 1 commit
  18. 23 Aug, 2017 1 commit
    • Y
      tipc: fix a race condition of releasing subscriber object · fd849b7c
      Ying Xue committed
      Whether a request is queued as a work item to cancel a subscription
      or to delete a subscription's subscriber asynchronously, the work
      items may be executed by different workers. As a result, a request
      raised before another request is not guaranteed to be handled first.
      If the later request is executed before the earlier one, the
      following error may occur:
      
      [  656.183644] BUG: spinlock bad magic on CPU#0, kworker/u8:0/12117
      [  656.184487] general protection fault: 0000 [#1] SMP
      [  656.185160] Modules linked in: tipc ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 9pnet virtio_net virtio_pci virtio_ring virtio [last unloaded: ip6_udp_tunnel]
      [  656.187003] CPU: 0 PID: 12117 Comm: kworker/u8:0 Not tainted 4.11.0-rc7+ #6
      [  656.187920] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  656.188690] Workqueue: tipc_rcv tipc_recv_work [tipc]
      [  656.189371] task: ffff88003f5cec40 task.stack: ffffc90004448000
      [  656.190157] RIP: 0010:spin_bug+0xdd/0xf0
      [  656.190678] RSP: 0018:ffffc9000444bcb8 EFLAGS: 00010202
      [  656.191375] RAX: 0000000000000034 RBX: ffff88003f8d1388 RCX: 0000000000000000
      [  656.192321] RDX: ffff88003ba13708 RSI: ffff88003ba0cd08 RDI: ffff88003ba0cd08
      [  656.193265] RBP: ffffc9000444bcd0 R08: 0000000000000030 R09: 000000006b6b6b6b
      [  656.194208] R10: ffff8800bde3e000 R11: 00000000000001b4 R12: 6b6b6b6b6b6b6b6b
      [  656.195157] R13: ffffffff81a3ca64 R14: ffff88003f8d1388 R15: ffff88003f8d13a0
      [  656.196101] FS:  0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
      [  656.197172] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  656.197935] CR2: 00007f0b3d2e6000 CR3: 000000003ef9e000 CR4: 00000000000006f0
      [  656.198873] Call Trace:
      [  656.199210]  do_raw_spin_lock+0x66/0xa0
      [  656.199735]  _raw_spin_lock_bh+0x19/0x20
      [  656.200258]  tipc_subscrb_subscrp_delete+0x28/0xf0 [tipc]
      [  656.200990]  tipc_subscrb_rcv_cb+0x45/0x260 [tipc]
      [  656.201632]  tipc_receive_from_sock+0xaf/0x100 [tipc]
      [  656.202299]  tipc_recv_work+0x2b/0x60 [tipc]
      [  656.202872]  process_one_work+0x157/0x420
      [  656.203404]  worker_thread+0x69/0x4c0
      [  656.203898]  kthread+0x138/0x170
      [  656.204328]  ? process_one_work+0x420/0x420
      [  656.204889]  ? kthread_create_on_node+0x40/0x40
      [  656.205527]  ret_from_fork+0x29/0x40
      [  656.206012] Code: 48 8b 0c 25 00 c5 00 00 48 c7 c7 f0 24 a3 81 48 81 c1 f0 05 00 00 65 8b 15 61 ef f5 7e e8 9a 4c 09 00 4d 85 e4 44 8b 4b 08 74 92 <45> 8b 84 24 40 04 00 00 49 8d 8c 24 f0 05 00 00 eb 8d 90 0f 1f
      [  656.208504] RIP: spin_bug+0xdd/0xf0 RSP: ffffc9000444bcb8
      [  656.209798] ---[ end trace e2a800e6eb0770be ]---
      
      In the above scenario, the request to delete the subscriber was
      performed earlier than the request to cancel a subscription, although
      the latter was issued first, which means tipc_subscrb_delete() was
      called before tipc_subscrp_cancel(). As a result, when
      tipc_subscrb_subscrp_delete(), called by tipc_subscrp_cancel(), was
      executed to cancel a subscription, the refcnt of the subscription's
      subscriber had already been decreased to 1. tipc_subscrp_delete()
      then decremented the refcnt to zero and the subscriber was freed,
      but the subscriber's lock still had to be released afterwards, which
      caused the panic.
      
      If we instead increase the subscriber's refcnt before
      tipc_subscrb_subscrp_delete() is called in tipc_subscrp_cancel(),
      the panic is avoided.
      
      Fixes: d094c4d5 ("tipc: add subscription refcount to avoid invalid delete")
      Reported-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fd849b7c
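The refcount pattern behind this fix can be sketched in user space. This is a simplified model under stated assumptions, not the kernel code: `struct subscriber`, `sub_put()`, `subscrp_delete()` and `cancel()` are hypothetical names, the lock is a plain flag, and "freeing" only sets a flag so the sketch can observe the bug without touching freed memory.

```c
#include <assert.h>

/* Hypothetical model of a subscriber: refcount, lock flag, freed flag. */
struct subscriber {
    int refcnt;
    int locked;
    int freed;
};

/* Drop one reference; the object counts as freed when the count hits zero. */
static void sub_put(struct subscriber *s)
{
    if (--s->refcnt == 0)
        s->freed = 1;
}

/*
 * Models the delete path: take the subscriber lock, drop the reference
 * held by the subscription, then release the lock.
 * Returns 1 if the lock was released on an already freed object.
 */
static int subscrp_delete(struct subscriber *s)
{
    s->locked = 1;
    sub_put(s);
    s->locked = 0; /* the unlock still touches the object */
    return s->freed;
}

/*
 * Models the cancel path running after the subscriber delete request,
 * so only the subscription's own reference remains (refcnt == 1).
 * With hold_ref set, an extra reference is held across the delete.
 */
static int cancel(int hold_ref)
{
    struct subscriber s = { 1, 0, 0 };
    int bug;

    if (hold_ref)
        s.refcnt++;
    bug = subscrp_delete(&s);
    if (hold_ref)
        sub_put(&s); /* drop the temporary reference after the unlock */
    return bug;
}
```

In this model `cancel(0)` reports the unlock-after-free, and `cancel(1)` does not, because the temporary reference keeps the subscriber alive until its lock has been released.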