1. 17 Feb, 2018 5 commits
    • J
      tipc: simplify endianness handling in topology subscriber · 8985ecc7
      Committed by Jon Maloy
      Because of the requirement for total distribution transparency, users
      send subscriptions and receive topology events in their own host format.
      It is up to the topology server to determine this format and do the
      correct conversions to and from its own host format when needed.
      
      Until now, this has been handled in a rather non-transparent way inside
      the topology server and subscriber code, leading to unnecessary
      complexity when creating subscriptions and issuing events.
      
      We now improve this situation by adding two new macros, tipc_sub_read()
      and tipc_evt_write(). Both macros determine the need for conversion
      internally before performing their respective operations. Hence, all
      handling of such conversions becomes transparent to the rest of the
      code.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8985ecc7
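The idea of an accessor that decides on byte swapping internally can be sketched in plain userspace C. This is only an illustration, not the kernel macros: `struct sub_req`, `FILTER_MASK`, `swap32()`, `sub_needs_swap()` and `sub_read()` are hypothetical names, and the detection rule (a valid filter always has at least one bit set inside the mask when read in host order) is an assumption made for the sketch.

```c
#include <stdint.h>

/* Hypothetical: bits that a valid filter value may use in host order */
#define FILTER_MASK 0x0000000fu

/* Hypothetical stand-in for a subscription request received from a user */
struct sub_req {
    uint32_t type;
    uint32_t lower;
    uint32_t upper;
    uint32_t filter;
};

/* Reverse the byte order of a 32-bit value */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Assumption: if no filter bit falls inside the mask, the request was
 * written in the opposite byte order and every field must be swapped. */
static int sub_needs_swap(const struct sub_req *s)
{
    return (s->filter & FILTER_MASK) == 0;
}

/* Read one field of the request, converted to host order if needed */
static uint32_t sub_read(const struct sub_req *s, uint32_t field_value)
{
    return sub_needs_swap(s) ? swap32(field_value) : field_value;
}
```

A caller would write `sub_read(&req, req.lower)` and never look at the byte order itself, which is the transparency the commit message describes.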
    • J
      tipc: simplify interaction between subscription and topology connection · 414574a0
      Committed by Jon Maloy
      The message transmission and reception in the topology server is more
      generic than is currently necessary. By basing the functionality on the
      fact that we only send items of type struct tipc_event and always
      receive items of struct tipc_subscr we can make several simplifications,
      and also get rid of some unnecessary dynamic memory allocations.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      414574a0
    • J
      tipc: eliminate struct tipc_subscriber · df79d040
      Committed by Jon Maloy
      It is unnecessary to keep two structures, struct tipc_conn and struct
      tipc_subscriber, with a one-to-one relationship and still with different
      life cycles. The fact that the two often run in different contexts, and
      still may access each other via direct pointers, constitutes an additional
      hazard, something we have experienced on several occasions, and still
      see happening.
      
      We have identified at least two remaining problems that are easier to
      fix if we simplify the topology server data structure somewhat.
      
      - When there is a race between a subscription up/down event and a
        timeout event, it is fully possible that the former might be delivered
        after the latter, leading to confusion for the receiver.
      
      - The function tipc_subscrp_timeout() is executing in interrupt context,
        while the following call chain is at least theoretically possible:
        tipc_subscrp_timeout()
          tipc_subscrp_send_event()
            tipc_conn_sendmsg()
              conn_put()
                tipc_conn_kref_release()
                  sock_release(sock)
      
      I.e., we end up calling a function that might try to sleep in
      interrupt context. To eliminate this, we need to ensure that the
      tipc_conn structure and the socket, as well as the subscription
      instances, are only deleted in work queue context, i.e., after the
      timeout event really has been sent out.
      
      We now remove this unnecessary complexity, by merging data and
      functionality of the subscriber structure into struct tipc_conn
      and the associated file server.c. We thereafter add a spinlock and
      a new 'inactive' state to the subscription structure. Using those,
      both problems described above can be easily solved.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      df79d040
    • J
      tipc: remove unnecessary function pointers · c901d26d
      Committed by Jon Maloy
      Interaction between the functionality in server.c and subscr.c is
      done via function pointers installed in struct server. This makes
      the code harder to follow, and doesn't serve any obvious purpose.
      
      Here, we replace the function pointers with direct function calls.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c901d26d
    • J
      tipc: remove redundant code in topology server · 27469b73
      Committed by Jon Maloy
      The socket handling in the topology server is unnecessarily generic.
      It is prepared to handle SOCK_RDM, SOCK_DGRAM and SOCK_STREAM type
      sockets, as well as the only socket type which is really used,
      SOCK_SEQPACKET.
      
      We now remove this redundant code to make the code more readable.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      27469b73
  2. 15 Feb, 2018 1 commit
  3. 13 Feb, 2018 1 commit
    • D
      net: make getname() functions return length rather than use int* parameter · 9b2c45d4
      Committed by Denys Vlasenko
      Changes since v1:
      Added changes in these files:
          drivers/infiniband/hw/usnic/usnic_transport.c
          drivers/staging/lustre/lnet/lnet/lib-socket.c
          drivers/target/iscsi/iscsi_target_login.c
          drivers/vhost/net.c
          fs/dlm/lowcomms.c
          fs/ocfs2/cluster/tcp.c
          security/tomoyo/network.c
      
      Before:
      All these functions either return a negative error indicator,
      or store length of sockaddr into "int *socklen" parameter
      and return zero on success.
      
      "int *socklen" parameter is awkward. For example, if caller does not
      care, it still needs to provide on-stack storage for the value
      it does not need.
      
      None of the many FOO_getname() functions of various protocols
      ever used old value of *socklen. They always just overwrite it.
      
      This change drops this parameter, and makes all these functions, on success,
      return length of sockaddr. It's always >= 0 and can be differentiated
      from an error.
      
      Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.
      
      rpc_sockname() lost "int buflen" parameter, since its only use was
      to be passed to kernel_getsockname() as &buflen and subsequently
      not used in any way.
      
      Userspace API is not changed.
      
          text    data     bss      dec     hex filename
      30108430 2633624  873672 33615726 200ef6e vmlinux.before.o
      30108109 2633612  873672 33615393 200ee21 vmlinux.o
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: linux-kernel@vger.kernel.org
      CC: netdev@vger.kernel.org
      CC: linux-bluetooth@vger.kernel.org
      CC: linux-decnet-user@lists.sourceforge.net
      CC: linux-wireless@vger.kernel.org
      CC: linux-rdma@vger.kernel.org
      CC: linux-sctp@vger.kernel.org
      CC: linux-nfs@vger.kernel.org
      CC: linux-x25@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9b2c45d4
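The change of calling convention can be illustrated with a small userspace sketch. None of the names below are kernel functions: `struct demo_addr`, `demo_getname_old()` and `demo_getname_new()` are hypothetical, and the `connected` argument merely stands in for whatever condition makes a real getname() fail.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical address type, standing in for a sockaddr */
struct demo_addr {
    unsigned short family;
    unsigned char data[14];
};

/* Old convention: return 0 on success and store the length in *len,
 * which the caller must provide even when it does not care. */
static int demo_getname_old(int connected, struct demo_addr *out, int *len)
{
    if (!connected)
        return -ENOTCONN;
    memset(out, 0, sizeof(*out));
    *len = (int)sizeof(*out);
    return 0;
}

/* New convention: return the length (always >= 0) on success and a
 * negative error code on failure, so no out-parameter is needed. */
static int demo_getname_new(int connected, struct demo_addr *out)
{
    if (!connected)
        return -ENOTCONN;
    memset(out, 0, sizeof(*out));
    return (int)sizeof(*out);
}
```

With the new form, a caller that previously tested `if (err)` has to test `if (err < 0)` instead, since a successful call now returns a positive length.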
  4. 12 Feb, 2018 1 commit
    • L
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Committed by Linus Torvalds
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But the keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a9a08845
  5. 09 Feb, 2018 1 commit
    • H
      tipc: fix skb truesize/datasize ratio control · 55b3280d
      Committed by Hoang Le
      In commit d618d09a ("tipc: enforce valid ratio between skb truesize
      and contents") we introduced a test for ensuring that the condition
      truesize/datasize <= 4 is true for a received buffer. Unfortunately this
      test has two problems.
      
      - Because of the integer arithmetic the test
        if (skb->truesize / buf_roundup_len(skb) > 4) will miss all
        ratios [4 < ratio < 5], which was not the intention.
      - The buffer returned by skb_copy() inherits skb->truesize of the
        original buffer, which doesn't help the situation at all.
      
      In this commit, we change the ratio condition and replace skb_copy()
      with a call to skb_copy_expand() to finally get this right.
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      55b3280d
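The truncation problem in the first bullet is easy to reproduce outside the kernel. The sketch below compares the division form with a multiplication form that catches every ratio above 4. Both function names are hypothetical, `datasize` is assumed to be non-zero, and the multiplication form is just one way to avoid the truncation, not necessarily the condition this commit ends up using.

```c
/* Division form: integer division truncates, so any ratio strictly
 * between 4 and 5 evaluates to 4 and slips through the test. */
static int ratio_exceeded_div(unsigned int truesize, unsigned int datasize)
{
    return truesize / datasize > 4;
}

/* Multiplication form: no truncation; widened to avoid overflow. */
static int ratio_exceeded_mul(unsigned int truesize, unsigned int datasize)
{
    return (unsigned long long)truesize > 4ull * datasize;
}
```

For example, with a truesize of 900 and a datasize of 200 the real ratio is 4.5, yet `900 / 200` evaluates to 4, so the division form does not flag the buffer while the multiplication form does.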
  6. 20 Jan, 2018 1 commit
    • J
      tipc: fix race between poll() and setsockopt() · 60c25306
      Committed by Jon Maloy
      Letting tipc_poll() dereference a socket's pointer to struct tipc_group
      entails a race risk, as the group item may be deleted in a concurrent
      tipc_sk_join() or tipc_sk_leave() thread.
      
      We now move the 'open' flag in struct tipc_group to struct tipc_sock,
      and let the former retain only a pointer to the moved field. This will
      eliminate the race risk.
      
      Reported-by: syzbot+799dafde0286795858ac@syzkaller.appspotmail.com
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      60c25306
  7. 17 Jan, 2018 1 commit
    • J
      tipc: fix race condition at topology server receive · e88f2be8
      Committed by Jon Maloy
      We have identified a race condition during reception of socket
      events and messages in the topology server.
      
      - The function tipc_close_conn() is releasing the corresponding
        struct tipc_subscriber instance without considering that there
        may still be items in the receive work queue. When those are
        scheduled, in the function tipc_receive_from_work(), they are
        using the subscriber pointer stored in struct tipc_conn, without
        first checking if this is valid or not. This will sometimes
        lead to crashes, as the next call of tipc_conn_recvmsg() will
        access the now deleted item.
        We fix this by making the usage of this pointer conditional on
        whether the connection is active or not. I.e., we check the condition
        test_bit(CF_CONNECTED) before making the call tipc_conn_recvmsg().
      
      - Since the two functions may be running on different cores, the
        condition test described above is not enough. tipc_close_conn()
        may come in between and delete the subscriber item after the condition
        test is done, but before tipc_conn_recv_msg() is finished. This
        happens less frequently than the problem described above, but leads
        to the same symptoms.
      
        We fix this by using the existing sk_callback_lock for mutual
        exclusion in the two functions. In addition, we have to move
        a call to tipc_conn_terminate() outside the mentioned lock to
        avoid deadlock.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e88f2be8
  8. 16 Jan, 2018 3 commits
  9. 10 Jan, 2018 9 commits
    • J
      tipc: improve poll() for group member socket · eb929a91
      Committed by Jon Maloy
      The current criterion for returning POLLOUT from a group member socket is
      too simplistic. It basically returns POLLOUT as soon as the group has
      external destinations, something obviously leading to a lot of spinning
      during destination congestion situations. At the same time, the internal
      congestion handling is unnecessarily complex.
      
      We now change this as follows.
      
      - We introduce an 'open' flag in struct tipc_group. This flag is used
        only to help poll() get the setting of POLLOUT right, and *not* for
        congestion handling as such. This means that a user can choose to
        ignore an EAGAIN for a destination and go on sending messages to
        other destinations in the group if he wants to.
      
      - The flag is set to false every time we return EAGAIN on a send call.
      
      - The flag is set to true every time any member, i.e., not necessarily
        the member that caused EAGAIN, is removed from the small_win list.
      
      - We remove the group member 'usr_pending' flag. The size of the send
        window and presence in the 'small_win' list are sufficient criteria
        for recognizing congestion.
      
      This solution seems to be a reasonable compromise between 'anycast',
      which is normally not waiting for POLLOUT for a specific destination,
      and the other three send modes, which are.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb929a91
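The flag rules listed in this commit message can be sketched as a tiny state holder. This is a simplified userspace model and not the tipc code: `struct demo_group`, `DEMO_POLLOUT`, `demo_send()`, `demo_leave_small_win()` and `demo_poll()` are hypothetical names, and the congestion decision is passed in as a plain argument.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical event bit reported to the poller */
#define DEMO_POLLOUT 0x0004u

/* Hypothetical group state: the 'open' flag only steers poll() */
struct demo_group {
    bool open;
    int small_win_cnt; /* members currently on the small-window list */
};

/* A send that hits a congested destination fails and closes the flag */
static int demo_send(struct demo_group *g, bool dest_congested)
{
    if (dest_congested) {
        g->open = false;
        return -EAGAIN;
    }
    return 0;
}

/* Any member leaving the small-window list reopens the flag, whether
 * or not it was the member that caused the earlier EAGAIN. */
static void demo_leave_small_win(struct demo_group *g)
{
    if (g->small_win_cnt > 0)
        g->small_win_cnt--;
    g->open = true;
}

/* poll() reports writability purely from the flag */
static unsigned int demo_poll(const struct demo_group *g)
{
    return g->open ? DEMO_POLLOUT : 0;
}
```

Because the flag is advisory, a sender in this model may still ignore the EAGAIN and try other destinations; only the poll result changes.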
    • J
      tipc: improve groupcast scope handling · 232d07b7
      Committed by Jon Maloy
      When a member joins a group, it also indicates a binding scope. This
      makes it possible to create both node local groups, invisible to other
      nodes, as well as cluster global groups, visible everywhere.
      
      In order to avoid that different members end up having permanently
      differing views of group size and membership, we must inhibit locally
      and globally bound members from joining the same group.
      
      We do this by using the binding scope as an additional separator between
      groups. I.e., a member must ignore all membership events from sockets
      using a different scope than itself, and all lookups for message
      destinations must require an exact match between the message's lookup
      scope and the potential target's binding scope.
      
      Apart from making it possible to create local groups using the same
      identity on different nodes, a side effect of this is that it now also
      becomes possible to create a cluster global group with the same identity
      across the same nodes, without interfering with the local groups.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      232d07b7
    • J
      tipc: add option to suppress PUBLISH events for pre-existing publications · 8348500f
      Committed by Jon Maloy
      Currently, when a user is subscribing for binding table publications,
      he will receive a PUBLISH event for all already existing matching items
      in the binding table.
      
      However, a group socket making a subscription doesn't need this initial
      status update from the binding table, because it has already scanned it
      during the join operation. Worse, the multiplicative effect of issuing
      mutual events for dozens or hundreds of group members within a short time
      frame puts a heavy load on the topology server, with the end result that
      scale out operations on a big group tend to take much longer than needed.
      
      We now add a new filter option, TIPC_SUB_NO_STATUS, for topology server
      subscriptions, so that this initial avalanche of events is suppressed.
      This change, along with the previous commit, significantly improves the
      range and speed of group scale out operations.
      
      We keep the new option internal for the tipc driver, at least for now.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8348500f
    • J
      tipc: send out join messages as soon as new member is discovered · d12d2e12
      Committed by Jon Maloy
      When a socket is joining a group, we look up in the binding table to
      find if there are already other members of the group present. This is
      used for being able to return EAGAIN instead of EHOSTUNREACH if the
      user proceeds directly to a send attempt.
      
      However, the information in the binding table can be used to directly
      set the created member in state MBR_PUBLISHED and send a JOIN message
      to the peer, instead of waiting for a topology PUBLISH event to do this.
      When there are many members in a group, the propagation time for such
      events can be significant, and we can save time during the join
      operation if we use the initial lookup result fully.
      
      In this commit, we eliminate the member state MBR_DISCOVERED which has
      been the result of the initial lookup, and instead go directly to
      MBR_PUBLISHED, which initiates the setup.
      
      After this change, the tipc_member FSM looks as follows:
      
           +-----------+
      ---->| PUBLISHED |-----------------------------------------------+
      PUB- +-----------+                                LEAVE/WITHDRAW |
      LISH       |JOIN                                                 |
                 |     +-------------------------------------------+   |
                 |     |                            LEAVE/WITHDRAW |   |
                 |     |                +------------+             |   |
                 |     |   +----------->|  PENDING   |---------+   |   |
                 |     |   |msg/maxactv +-+---+------+  LEAVE/ |   |   |
                 |     |   |              |   |       WITHDRAW |   |   |
                 |     |   |   +----------+   |                |   |   |
                 |     |   |   |revert/maxactv|                |   |   |
                 |     |   |   V              V                V   V   V
                 |   +----------+  msg  +------------+       +-----------+
                 +-->|  JOINED  |------>|   ACTIVE   |------>|  LEAVING  |--->
                 |   +----------+       +-----+------+ LEAVE/+-----------+DOWN
                 |        A   A               |      WITHDRAW A   A    A   EVT
                 |        |   |               |RECLAIM        |   |    |
                 |        |   |REMIT          V               |   |    |
                 |        |   |== adv   +------------+        |   |    |
                 |        |   +---------| RECLAIMING |--------+   |    |
                 |        |             +-----+------+  LEAVE/    |    |
                 |        |                   |REMIT   WITHDRAW   |    |
                 |        |                   |< adv              |    |
                 |        |msg/               V            LEAVE/ |    |
                 |        |adv==ADV_IDLE+------------+   WITHDRAW |    |
                 |        +-------------|  REMITTED  |------------+    |
                 |                      +------------+                 |
                 |PUBLISH                                              |
      JOIN +-----------+                                LEAVE/WITHDRAW |
      ---->|  JOINING  |-----------------------------------------------+
           +-----------+
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d12d2e12
    • J
      tipc: simplify group LEAVE sequence · c2b22bcf
      Committed by Jon Maloy
      After the changes in the previous commit the group LEAVE sequence
      can be simplified.
      
      We now let the arrival of a LEAVE message unconditionally issue a group
      DOWN event to the user. When a topology WITHDRAW event is received, the
      member, if it is still there, is set to state LEAVING, but we only issue a
      group DOWN event when the link to the peer node is gone, so that no
      LEAVE message is to be expected.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c2b22bcf
    • J
      tipc: create group member event messages when they are needed · 7ad32bcb
      Committed by Jon Maloy
      In the current implementation, a group socket receiving topology
      events about other members just converts the topology event message
      into a group event message and stores it until it reaches the right
      state to issue it to the user. This complicates the code unnecessarily,
      and becomes impractical when, in the coming commits, we will need to
      create and issue membership events independently.
      
      In this commit, we change this so that we just notice the type and
      origin of the incoming topology event, and then drop the buffer. Only
      when it is time to actually send a group event to the user do we
      explicitly create a new message and send it upwards.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7ad32bcb
    • J
      tipc: adjustment to group member FSM · 0233493a
      Committed by Jon Maloy
      Analysis reveals that the member state MBR_QUARANTINED in reality is
      unnecessary, and can be replaced by the state MBR_JOINING at all
      occurrences.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0233493a
    • J
      tipc: let group member stay in JOINED mode if unable to reclaim · 4ea5dab5
      Committed by Jon Maloy
      We handle a corner case in the function tipc_group_update_rcv_win().
      During extreme pressure it might happen that a message receiver has all
      its active senders in RECLAIMING or REMITTED mode, meaning that there
      is nobody to reclaim advertisements from if an additional sender tries
      to go active.
      
      Currently we just set the new sender to ACTIVE anyway, hence at least
      theoretically opening up for a receiver queue overflow by exceeding the
      MAX_ACTIVE limit. The correct solution to this is to instead add the
      member to the pending queue, while letting the oldest member in that
      queue revert to JOINED state.
      
      In this commit we refactor the code for handling message arrival from
      a JOINED member, both to make it more comprehensible and to cover the
      case described above.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ea5dab5
    • J
      tipc: a couple of cleanups · 8d5dee21
      Committed by Jon Maloy
      - We remove the 'reclaiming' member list in struct tipc_group, since
        it doesn't serve any purpose.
      
      - We simplify the GRP_REMIT_MSG branch of tipc_group_protocol_rcv().
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d5dee21
  10. 09 Jan, 2018 1 commit
  11. 06 Jan, 2018 2 commits
  12. 03 Jan, 2018 1 commit
    • J
      tipc: fix problems with multipoint-to-point flow control · f9c935db
      Committed by Jon Maloy
      In commit 04d7b574 ("tipc: add multipoint-to-point flow control") we
      introduced a protocol for preventing buffer overflow when many group
      members try to simultaneously send messages to the same receiving member.
      
      Stress testing of this mechanism has revealed a couple of related bugs:
      
      - When the receiving member receives an advertisement REMIT message from
        one of the senders, it will sometimes prematurely activate a pending
        member and send it the remitted advertisement, although the upper
        limit for active senders has been reached. This leads to accumulation
        of illegal advertisements, and eventually to messages being dropped
        because of receive buffer overflow.
      
      - When the receiving member leaves REMITTED state while a received
        message is being read, we fail to check the pending queue and
        activate the oldest pending peer. This leads to some pending senders
        being starved out, and never getting the opportunity to profit from
        the remitted advertisement.
      
      We fix the former in the function tipc_group_proto_rcv() by returning
      directly from the function once it becomes clear that the remitting
      peer cannot leave REMITTED state at that point.
      
      We fix the latter in the function tipc_group_update_rcv_win() by looking
      up and activating the longest pending peer when it becomes clear that the
      remitting peer now can leave REMITTED state.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f9c935db
  13. 29 Dec, 2017 1 commit
  14. 27 Dec, 2017 4 commits
    • T
      tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path · 642a8439
      Committed by Tommi Rantala
      Calling tipc_mon_delete() before the monitor has been created will oops.
      This can happen in tipc_enable_bearer() error path if tipc_disc_create()
      fails.
      
      [   48.589074] BUG: unable to handle kernel paging request at 0000000000001008
      [   48.590266] IP: tipc_mon_delete+0xea/0x270 [tipc]
      [   48.591223] PGD 1e60c5067 P4D 1e60c5067 PUD 1eb0cf067 PMD 0
      [   48.592230] Oops: 0000 [#1] SMP KASAN
      [   48.595610] CPU: 5 PID: 1199 Comm: tipc Tainted: G    B            4.15.0-rc4-pc64-dirty #5
      [   48.597176] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      [   48.598489] RIP: 0010:tipc_mon_delete+0xea/0x270 [tipc]
      [   48.599347] RSP: 0018:ffff8801d827f668 EFLAGS: 00010282
      [   48.600705] RAX: ffff8801ee813f00 RBX: 0000000000000204 RCX: 0000000000000000
      [   48.602183] RDX: 1ffffffff1de6a75 RSI: 0000000000000297 RDI: 0000000000000297
      [   48.604373] RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1dd1533
      [   48.605607] R10: ffffffff8eafbb05 R11: fffffbfff1dd1534 R12: 0000000000000050
      [   48.607082] R13: dead000000000200 R14: ffffffff8e73f310 R15: 0000000000001020
      [   48.608228] FS:  00007fc686484800(0000) GS:ffff8801f5540000(0000) knlGS:0000000000000000
      [   48.610189] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   48.611459] CR2: 0000000000001008 CR3: 00000001dda70002 CR4: 00000000003606e0
      [   48.612759] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   48.613831] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   48.615038] Call Trace:
      [   48.615635]  tipc_enable_bearer+0x415/0x5e0 [tipc]
      [   48.620623]  tipc_nl_bearer_enable+0x1ab/0x200 [tipc]
      [   48.625118]  genl_family_rcv_msg+0x36b/0x570
      [   48.631233]  genl_rcv_msg+0x5a/0xa0
      [   48.631867]  netlink_rcv_skb+0x1cc/0x220
      [   48.636373]  genl_rcv+0x24/0x40
      [   48.637306]  netlink_unicast+0x29c/0x350
      [   48.639664]  netlink_sendmsg+0x439/0x590
      [   48.642014]  SYSC_sendto+0x199/0x250
      [   48.649912]  do_syscall_64+0xfd/0x2c0
      [   48.650651]  entry_SYSCALL64_slow_path+0x25/0x25
      [   48.651843] RIP: 0033:0x7fc6859848e3
      [   48.652539] RSP: 002b:00007ffd25dff938 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [   48.654003] RAX: ffffffffffffffda RBX: 00007ffd25dff990 RCX: 00007fc6859848e3
      [   48.655303] RDX: 0000000000000054 RSI: 00007ffd25dff990 RDI: 0000000000000003
      [   48.656512] RBP: 00007ffd25dff980 R08: 00007fc685c35fc0 R09: 000000000000000c
      [   48.657697] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000d13010
      [   48.658840] R13: 00007ffd25e009c0 R14: 0000000000000000 R15: 0000000000000000
      [   48.662972] RIP: tipc_mon_delete+0xea/0x270 [tipc] RSP: ffff8801d827f668
      [   48.664073] CR2: 0000000000001008
      [   48.664576] ---[ end trace e811818d54d5ce88 ]---
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      642a8439
    • T
      tipc: error path leak fixes in tipc_enable_bearer() · 19142551
      Committed by Tommi Rantala
      Fix memory leak in tipc_enable_bearer() if enable_media() fails, and
      cleanup with bearer_disable() if tipc_mon_create() fails.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      19142551
    • J
      tipc: fix memory leak of group member when peer node is lost · 3a33a19b
      Committed by Jon Maloy
      When a group member receives a member WITHDRAW event, this might have
      two reasons: either the peer member is leaving the group, or the link
      to the member's node has been lost.
      
      In the latter case we need to issue a DOWN event to the user right away,
      and let function tipc_group_filter_msg() perform delete of the member
      item. However, in this case we fail to change the state of the member
      item to MBR_LEAVING, so the member item is not deleted, and we have a
      memory leak.
      
      We now distinguish more clearly between the four sub-cases of a WITHDRAW
      event and make sure that each case is handled correctly.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a33a19b
    • J
      tipc: base group replicast ack counter on number of actual receivers · 0a3d805c
      Committed by Jon Maloy
      In commit 2f487712 ("tipc: guarantee that group broadcast doesn't
      bypass group unicast") we introduced a mechanism that requires the first
      (replicated) broadcast sent after a unicast to be acknowledged by all
      receivers before permitting sending of the next (true) broadcast.
      
      The counter for keeping track of the number of acknowledgements to expect
      is based on the tipc_group::member_cnt variable. But this misses that
      some of the known members may not be ready for reception, and will never
      acknowledge the message, either because they haven't fully joined the
      group or because they are leaving the group. Such members are identified
      by not fulfilling the condition tested for in the function
      tipc_group_is_enabled().
      
      We now set the counter for the actual number of acks to receive at the
      moment the message is sent, by just counting the number of recipients
      satisfying the tipc_group_is_enabled() test.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a3d805c
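Counting only the members that can actually acknowledge, rather than using the total member count, can be sketched as follows. The types and names (`enum demo_state`, `struct demo_member`, `demo_is_enabled()`, `demo_acks_expected()`) are hypothetical, and the choice of which states count as "enabled" is an assumption made for the sketch, not the condition tested by tipc_group_is_enabled().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified member states */
enum demo_state { DEMO_JOINING, DEMO_JOINED, DEMO_ACTIVE, DEMO_LEAVING };

struct demo_member {
    enum demo_state state;
};

/* Assumption: only fully joined members will ever acknowledge */
static bool demo_is_enabled(const struct demo_member *m)
{
    return m->state == DEMO_JOINED || m->state == DEMO_ACTIVE;
}

/* Number of acks to wait for, fixed at the moment the message is sent */
static int demo_acks_expected(const struct demo_member *mbrs, size_t n)
{
    int cnt = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (demo_is_enabled(&mbrs[i]))
            cnt++;
    return cnt;
}
```

In a group with four known members of which one is still joining and one is leaving, this model waits for two acknowledgements instead of four, so the sender cannot stall on members that will never answer.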
  15. 21 Dec, 2017 1 commit
    • J
      tipc: remove joining group member from congested list · bb25c385
      Committed by Jon Maloy
      When we receive a JOIN message from a peer member, the message may
      contain an advertised window value ADV_IDLE that permits removing the
      member in question from the tipc_group::congested list. However, since
      the removal has been made conditional on the advertised window *not*
      being ADV_IDLE, we miss this case. This has the effect that a sender
      sometimes may enter a state of permanent, false, broadcast congestion.
      
      We fix this by unconditionally removing the member from the congested
      list before calling tipc_member_update(), which might potentially sort
      it into the list again.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bb25c385
  16. 20 Dec, 2017 1 commit
    • J
      tipc: fix list sorting bug in function tipc_group_update_member() · 3db09601
      Jon Maloy committed
      When, during a join operation, or during message transmission, a group
      member needs to be added to the group's 'congested' list, we sort it
      into the list in ascending order, according to its current advertised
      window size. However, we miss the case when the member is already on
      that list. This will have the result that the member, after the window
      size has been decremented, might be at the wrong position in that list.
      This in turn may cause us, during broadcast and multicast
      transmissions, to miss the fact that a destination is not yet ready
      for reception, and we end up sending anyway. From this point on, the
      behavior during the remaining session is unpredictable, e.g., with
      underflowing window sizes.
      
      We now correct this bug by unconditionally removing the member from
      the list before (re-)sorting it in.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3db09601
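The "remove first, then sort in" approach from commit 3db09601 can be sketched in userspace as follows. The sketch assumes a circular doubly linked list with a sentinel head, similar in spirit to the kernel's list_head; `struct member`, `list_init()`, `list_del_init()` and `congested_resort()` are hypothetical names defined here, not the functions the patch touches.

```c
#include <stddef.h>

/* Hypothetical stand-in for a group member on the 'congested' list. */
struct member {
	int window;                  /* current advertised window */
	struct member *prev, *next;  /* self-linked when not on a list */
};

/* Initialize a list head, or a member that is not on any list. */
static void list_init(struct member *m)
{
	m->prev = m->next = m;
}

/* Unlink m and leave it self-linked; a no-op if m is not on a list. */
static void list_del_init(struct member *m)
{
	m->prev->next = m->next;
	m->next->prev = m->prev;
	m->prev = m->next = m;
}

/* Unconditionally remove m, then sort it in by ascending window, so a
 * member that was already listed lands at the position matching its
 * new window instead of keeping a stale one. */
static void congested_resort(struct member *head, struct member *m)
{
	struct member *pos;

	list_del_init(m);
	for (pos = head->next; pos != head; pos = pos->next)
		if (pos->window > m->window)
			break;

	/* Insert m immediately before pos (or at the tail if pos == head). */
	m->prev = pos->prev;
	m->next = pos;
	pos->prev->next = m;
	pos->prev = m;
}
```

Because the unlink is harmless for a member that is not listed, the caller does not need to know whether the member was on the list before, which is the case the original code overlooked.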
  17. 19 Dec, 2017 2 commits
    • J
      tipc: remove leaving group member from all lists · 3f42f5fe
      Jon Maloy committed
      A group member going into state LEAVING should never go back to any
      other state before it is finally deleted. However, this might happen
      if the socket needs to send out a RECLAIM message during this interval.
      Since we forget to remove the leaving member from the group's 'active'
      or 'pending' list, the member might be selected for reclaiming, change
      state to RECLAIMING, and get stuck in this state instead of being
      deleted. This might lead to suppression of the expected 'member down'
      event to the receiver.
      
      We fix this by removing the member from all lists, except the RB tree,
      at the moment it goes into state LEAVING.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3f42f5fe
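The invariant behind commit 3f42f5fe, that a LEAVING member must no longer be reachable through the lists used to select reclaim candidates, can be modelled with a small sketch. It is heavily simplified: list membership is represented by boolean flags, and `struct member`, `member_set_leaving()` and `pick_reclaim()` are hypothetical names, not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced state set for illustration. */
enum mbr_state { MBR_ACTIVE, MBR_RECLAIMING, MBR_LEAVING };

struct member {
	enum mbr_state state;
	bool on_active;   /* stand-in for being on the group's 'active' list */
	bool on_pending;  /* stand-in for being on the group's 'pending' list */
};

/* Enter LEAVING and drop the member from every selection list at the
 * same moment, so nothing can pick it up again afterwards. */
static void member_set_leaving(struct member *m)
{
	m->state = MBR_LEAVING;
	m->on_active = false;
	m->on_pending = false;
}

/* Select a reclaim candidate among members still on the active list and
 * move it to RECLAIMING; returns NULL if there is no candidate. */
static struct member *pick_reclaim(struct member *mbrs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (mbrs[i].on_active) {
			mbrs[i].state = MBR_RECLAIMING;
			return &mbrs[i];
		}
	}
	return NULL;
}
```

If `member_set_leaving()` left `on_active` set, `pick_reclaim()` could overwrite LEAVING with RECLAIMING, which mirrors the stuck state the commit message describes.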
    • J
      tipc: fix lost member events bug · 23483399
      Jon Maloy committed
      Group messages are not supposed to be returned to sender when the
      destination socket disappears. This is done correctly for regular
      traffic messages, by setting the 'dest_droppable' bit in the header.
      But we forget to do that in group protocol messages. This has the effect
      that such messages may sometimes bounce back to the sender, be perceived
      as a legitimate peer message, and wreak general havoc for the rest of
      the session. In particular, we have seen that a member in state LEAVING
      may go back to state RECLAIMED or REMITTED, hence causing suppression
      of an otherwise expected 'member down' event to the user.
      
      We fix this by setting the 'dest_droppable' bit even in group protocol
      messages.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      23483399
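The effect of the 'dest_droppable' bit discussed in commit 23483399 can be sketched as a flag in a header word that decides whether an undeliverable message is bounced back. The bit position, `struct msg_hdr` and all three helper functions below are assumptions made for illustration; they are not taken from the TIPC header definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, chosen for illustration only. */
#define DEST_DROPPABLE_BIT 19u
#define DEST_DROPPABLE_MASK (UINT32_C(1) << DEST_DROPPABLE_BIT)

/* Hypothetical one-word message header. */
struct msg_hdr {
	uint32_t w0;
};

/* Set or clear the flag without touching the other header bits. */
static void hdr_set_dest_droppable(struct msg_hdr *h, bool on)
{
	if (on)
		h->w0 |= DEST_DROPPABLE_MASK;
	else
		h->w0 &= ~DEST_DROPPABLE_MASK;
}

static bool hdr_dest_droppable(const struct msg_hdr *h)
{
	return (h->w0 & DEST_DROPPABLE_MASK) != 0;
}

/* Decide the fate of a message whose destination socket is gone:
 * true means bounce it back to the sender, false means drop it. */
static bool should_return_to_sender(const struct msg_hdr *h)
{
	return !hdr_dest_droppable(h);
}
```

In this model, a group protocol message built without the flag would be returned to its sender, where it could be mistaken for a legitimate peer message; setting the flag at build time makes it silently droppable instead.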
  18. 14 Dec, 2017 1 commit
  19. 11 Dec, 2017 1 commit
    • T
      rhashtable: Change rhashtable_walk_start to return void · 97a6ec4a
      Tom Herbert committed
      Most callers of rhashtable_walk_start don't care about a resize event
      which is indicated by a return value of -EAGAIN. So calls to
      rhashtable_walk_start are wrapped with code to ignore -EAGAIN. Something
      like this is common:
      
             ret = rhashtable_walk_start(rhiter);
             if (ret && ret != -EAGAIN)
                     goto out;
      
      Since zero and -EAGAIN are the only possible return values from the
      function this check is pointless. The condition never evaluates to true.
      
      This patch changes rhashtable_walk_start to return void. This simplifies
      code for the callers that ignore -EAGAIN. For the few cases where the
      caller cares about the resize event, particularly where the table can be
      walked in multiple parts for netlink or seq file dump, the function
      rhashtable_walk_start_check has been added that returns -EAGAIN on a
      resize event.
      Signed-off-by: Tom Herbert <tom@quantonium.net>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      97a6ec4a
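The API split described in commit 97a6ec4a, a checked variant that reports the resize plus a void variant that ignores it, can be sketched in userspace. `struct walker`, `walk_start_check()`, `walk_start()` and `old_style_check()` are hypothetical stand-ins and do not reproduce the kernel's rhashtable implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical iterator; 'resized' simulates a concurrent table resize. */
struct walker {
	bool resized;
	bool started;
};

/* Checked variant: starts the walk and reports a resize as -EAGAIN. */
static int walk_start_check(struct walker *w)
{
	w->started = true;
	return w->resized ? -EAGAIN : 0;
}

/* Void variant for callers that do not care about a resize. */
static void walk_start(struct walker *w)
{
	(void)walk_start_check(w);
}

/* The caller-side test quoted in the commit message. With 0 and -EAGAIN
 * as the only possible inputs it can never evaluate to true. */
static bool old_style_check(int ret)
{
	return ret && ret != -EAGAIN;
}
```

Callers that dump a table in several parts would use the checked variant and restart or flag the dump on -EAGAIN; everyone else can call the void variant without any error handling.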
  20. 06 Dec, 2017 2 commits
    • J
      tipc: fix memory leak in tipc_accept_from_sock() · a7d5f107
      Jon Maloy committed
      When the function tipc_accept_from_sock() fails to create an instance of
      struct tipc_subscriber it fails to free the already created instance of
      struct tipc_conn before it returns.
      
      We fix that with this commit.
      Reported-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a7d5f107
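The leak pattern fixed in commit a7d5f107 is the common one where a second allocation fails after a first one has succeeded. A minimal userspace sketch, assuming hypothetical `struct conn` and `struct subscriber` types, a `live_conns` counter for observing leaks, and a `fail_subscriber` switch for injecting the failure:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the connection and subscriber objects. */
struct conn {
	void *usr_data;
};

struct subscriber {
	struct conn *con;
};

/* Number of connections currently allocated; lets a caller detect leaks. */
static int live_conns;

static struct conn *conn_alloc(void)
{
	struct conn *c = calloc(1, sizeof(*c));

	if (c)
		live_conns++;
	return c;
}

static void conn_free(struct conn *c)
{
	free(c->usr_data);
	free(c);
	live_conns--;
}

/* Create a connection and its subscriber. If the subscriber cannot be
 * created, release the connection before returning so nothing leaks. */
static struct conn *accept_conn(bool fail_subscriber)
{
	struct conn *c = conn_alloc();
	struct subscriber *s;

	if (!c)
		return NULL;

	s = fail_subscriber ? NULL : calloc(1, sizeof(*s));
	if (!s) {
		conn_free(c);  /* the cleanup step the buggy path skipped */
		return NULL;
	}
	s->con = c;
	c->usr_data = s;
	return c;
}
```

Calling `accept_conn(true)` returns NULL and leaves `live_conns` at zero; without the `conn_free()` call on the error path the counter would stay at one.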
    • C
      tipc: fix a null pointer deref on error path · 672ecbe1
      Cong Wang committed
      In tipc_topsrv_kern_subscr() when s->tipc_conn_new() fails
      we call tipc_close_conn() to clean up, but in this case
      calling conn_put() is enough.
      
      This fixes the following crash:
      
       kasan: GPF could be caused by NULL-ptr deref or user memory access
       general protection fault: 0000 [#1] SMP KASAN
       Dumping ftrace buffer:
          (ftrace buffer empty)
       Modules linked in:
       CPU: 0 PID: 3085 Comm: syzkaller064164 Not tainted 4.15.0-rc1+ #137
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       task: 00000000c24413a5 task.stack: 000000005e8160b5
       RIP: 0010:__lock_acquire+0xd55/0x47f0 kernel/locking/lockdep.c:3378
       RSP: 0018:ffff8801cb5474a8 EFLAGS: 00010002
       RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
       RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff85ecb400
       RBP: ffff8801cb547830 R08: 0000000000000001 R09: 0000000000000000
       R10: 0000000000000000 R11: ffffffff87489d60 R12: ffff8801cd2980c0
       R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020
       FS:  00000000014ee880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007ffee2426e40 CR3: 00000001cb85a000 CR4: 00000000001406f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
        __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
        _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:175
        spin_lock_bh include/linux/spinlock.h:320 [inline]
        tipc_subscrb_subscrp_delete+0x8f/0x470 net/tipc/subscr.c:201
        tipc_subscrb_delete net/tipc/subscr.c:238 [inline]
        tipc_subscrb_release_cb+0x17/0x30 net/tipc/subscr.c:316
        tipc_close_conn+0x171/0x270 net/tipc/server.c:204
        tipc_topsrv_kern_subscr+0x724/0x810 net/tipc/server.c:514
        tipc_group_create+0x702/0x9c0 net/tipc/group.c:184
        tipc_sk_join net/tipc/socket.c:2747 [inline]
        tipc_setsockopt+0x249/0xc10 net/tipc/socket.c:2861
        SYSC_setsockopt net/socket.c:1851 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1830
        entry_SYSCALL_64_fastpath+0x1f/0x96
      
      Fixes: 14c04493 ("tipc: add ability to order and receive topology events in driver")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      672ecbe1