1. 10 1月, 2018 4 次提交
  2. 09 1月, 2018 1 次提交
  3. 06 1月, 2018 2 次提交
  4. 03 1月, 2018 1 次提交
    • J
      tipc: fix problems with multipoint-to-point flow control · f9c935db
      Jon Maloy 提交于
      In commit 04d7b574 ("tipc: add multipoint-to-point flow control") we
      introduced a protocol for preventing buffer overflow when many group
      members try to simultaneously send messages to the same receiving member.
      
      Stress test of this mechanism has revealed a couple of related bugs:
      
      - When the receiving member receives an advertisement REMIT message from
        one of the senders, it will sometimes prematurely activate a pending
        member and send it the remitted advertisement, although the upper
        limit for active senders has been reached. This leads to accumulation
        of illegal advertisements, and eventually to messages being dropped
        because of receive buffer overflow.
      
      - When the receiving member leaves REMITTED state while a received
        message is being read, we miss to look at the pending queue, to
        activate the oldest pending peer. This leads to some pending senders
        being starved out, and never getting the opportunity to profit from
        the remitted advertisement.
      
      We fix the former in the function tipc_group_proto_rcv() by returning
      directly from the function once it becomes clear that the remitting
      peer cannot leave REMITTED state at that point.
      
      We fix the latter in the function tipc_group_update_rcv_win() by looking
      up and activate the longest pending peer when it becomes clear that the
      remitting peer now can leave REMITTED state.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f9c935db
  5. 29 12月, 2017 1 次提交
  6. 27 12月, 2017 4 次提交
    • T
      tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path · 642a8439
      Tommi Rantala 提交于
      Calling tipc_mon_delete() before the monitor has been created will oops.
      This can happen in tipc_enable_bearer() error path if tipc_disc_create()
      fails.
      
      [   48.589074] BUG: unable to handle kernel paging request at 0000000000001008
      [   48.590266] IP: tipc_mon_delete+0xea/0x270 [tipc]
      [   48.591223] PGD 1e60c5067 P4D 1e60c5067 PUD 1eb0cf067 PMD 0
      [   48.592230] Oops: 0000 [#1] SMP KASAN
      [   48.595610] CPU: 5 PID: 1199 Comm: tipc Tainted: G    B            4.15.0-rc4-pc64-dirty #5
      [   48.597176] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      [   48.598489] RIP: 0010:tipc_mon_delete+0xea/0x270 [tipc]
      [   48.599347] RSP: 0018:ffff8801d827f668 EFLAGS: 00010282
      [   48.600705] RAX: ffff8801ee813f00 RBX: 0000000000000204 RCX: 0000000000000000
      [   48.602183] RDX: 1ffffffff1de6a75 RSI: 0000000000000297 RDI: 0000000000000297
      [   48.604373] RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1dd1533
      [   48.605607] R10: ffffffff8eafbb05 R11: fffffbfff1dd1534 R12: 0000000000000050
      [   48.607082] R13: dead000000000200 R14: ffffffff8e73f310 R15: 0000000000001020
      [   48.608228] FS:  00007fc686484800(0000) GS:ffff8801f5540000(0000) knlGS:0000000000000000
      [   48.610189] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   48.611459] CR2: 0000000000001008 CR3: 00000001dda70002 CR4: 00000000003606e0
      [   48.612759] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   48.613831] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   48.615038] Call Trace:
      [   48.615635]  tipc_enable_bearer+0x415/0x5e0 [tipc]
      [   48.620623]  tipc_nl_bearer_enable+0x1ab/0x200 [tipc]
      [   48.625118]  genl_family_rcv_msg+0x36b/0x570
      [   48.631233]  genl_rcv_msg+0x5a/0xa0
      [   48.631867]  netlink_rcv_skb+0x1cc/0x220
      [   48.636373]  genl_rcv+0x24/0x40
      [   48.637306]  netlink_unicast+0x29c/0x350
      [   48.639664]  netlink_sendmsg+0x439/0x590
      [   48.642014]  SYSC_sendto+0x199/0x250
      [   48.649912]  do_syscall_64+0xfd/0x2c0
      [   48.650651]  entry_SYSCALL64_slow_path+0x25/0x25
      [   48.651843] RIP: 0033:0x7fc6859848e3
      [   48.652539] RSP: 002b:00007ffd25dff938 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [   48.654003] RAX: ffffffffffffffda RBX: 00007ffd25dff990 RCX: 00007fc6859848e3
      [   48.655303] RDX: 0000000000000054 RSI: 00007ffd25dff990 RDI: 0000000000000003
      [   48.656512] RBP: 00007ffd25dff980 R08: 00007fc685c35fc0 R09: 000000000000000c
      [   48.657697] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000d13010
      [   48.658840] R13: 00007ffd25e009c0 R14: 0000000000000000 R15: 0000000000000000
      [   48.662972] RIP: tipc_mon_delete+0xea/0x270 [tipc] RSP: ffff8801d827f668
      [   48.664073] CR2: 0000000000001008
      [   48.664576] ---[ end trace e811818d54d5ce88 ]---
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Acked-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NTommi Rantala <tommi.t.rantala@nokia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      642a8439
    • T
      tipc: error path leak fixes in tipc_enable_bearer() · 19142551
      Tommi Rantala 提交于
      Fix memory leak in tipc_enable_bearer() if enable_media() fails, and
      cleanup with bearer_disable() if tipc_mon_create() fails.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Acked-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NTommi Rantala <tommi.t.rantala@nokia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      19142551
    • J
      tipc: fix memory leak of group member when peer node is lost · 3a33a19b
      Jon Maloy 提交于
      When a group member receives a member WITHDRAW event, this might have
      two reasons: either the peer member is leaving the group, or the link
      to the member's node has been lost.
      
      In the latter case we need to issue a DOWN event to the user right away,
      and let function tipc_group_filter_msg() perform delete of the member
      item. However, in this case we miss to change the state of the member
      item to MBR_LEAVING, so the member item is not deleted, and we have a
      memory leak.
      
      We now separate better between the four sub-cases of a WITHRAW event
      and make sure that each case is handled correctly.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3a33a19b
    • J
      tipc: base group replicast ack counter on number of actual receivers · 0a3d805c
      Jon Maloy 提交于
      In commit 2f487712 ("tipc: guarantee that group broadcast doesn't
      bypass group unicast") we introduced a mechanism that requires the first
      (replicated) broadcast sent after a unicast to be acknowledged by all
      receivers before permitting sending of the next (true) broadcast.
      
      The counter for keeping track of the number of acknowledges to expect
      is based on the tipc_group::member_cnt variable. But this misses that
      some of the known members may not be ready for reception, and will never
      acknowledge the message, either because they haven't fully joined the
      group or because they are leaving the group. Such members are identified
      by not fulfilling the condition tested for in the function
      tipc_group_is_enabled().
      
      We now set the counter for the actual number of acks to receive at the
      moment the message is sent, by just counting the number of recipients
      satisfying the tipc_group_is_enabled() test.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a3d805c
  7. 21 12月, 2017 1 次提交
    • J
      tipc: remove joining group member from congested list · bb25c385
      Jon Maloy 提交于
      When we receive a JOIN message from a peer member, the message may
      contain an advertised window value ADV_IDLE that permits removing the
      member in question from the tipc_group::congested list. However, since
      the removal has been made conditional on that the advertised window is
      *not* ADV_IDLE, we miss this case. This has the effect that a sender
      sometimes may enter a state of permanent, false, broadcast congestion.
      
      We fix this by unconditinally removing the member from the congested
      list before calling tipc_member_update(), which might potentially sort
      it into the list again.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bb25c385
  8. 20 12月, 2017 1 次提交
    • J
      tipc: fix list sorting bug in function tipc_group_update_member() · 3db09601
      Jon Maloy 提交于
      When, during a join operation, or during message transmission, a group
      member needs to be added to the group's 'congested' list, we sort it
      into the list in ascending order, according to its current advertised
      window size. However, we miss the case when the member is already on
      that list. This will have the result that the member, after the window
      size has been decremented, might be at the wrong position in that list.
      This again may have the effect that we during broadcast and multicast
      transmissions miss the fact that a destination is not yet ready for
      reception, and we end up sending anyway. From this point on, the
      behavior during the remaining session is unpredictable, e.g., with
      underflowing window sizes.
      
      We now correct this bug by unconditionally removing the member from
      the list before (re-)sorting it in.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3db09601
  9. 19 12月, 2017 2 次提交
    • J
      tipc: remove leaving group member from all lists · 3f42f5fe
      Jon Maloy 提交于
      A group member going into state LEAVING should never go back to any
      other state before it is finally deleted. However, this might happen
      if the socket needs to send out a RECLAIM message during this interval.
      Since we forget to remove the leaving member from the group's 'active'
      or 'pending' list, the member might be selected for reclaiming, change
      state to RECLAIMING, and get stuck in this state instead of being
      deleted. This might lead to suppression of the expected 'member down'
      event to the receiver.
      
      We fix this by removing the member from all lists, except the RB tree,
      at the moment it goes into state LEAVING.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3f42f5fe
    • J
      tipc: fix lost member events bug · 23483399
      Jon Maloy 提交于
      Group messages are not supposed to be returned to sender when the
      destination socket disappears. This is done correctly for regular
      traffic messages, by setting the 'dest_droppable' bit in the header.
      But we forget to do that in group protocol messages. This has the effect
      that such messages may sometimes bounce back to the sender, be perceived
      as a legitimate peer message, and wreak general havoc for the rest of
      the session. In particular, we have seen that a member in state LEAVING
      may go back to state RECLAIMED or REMITTED, hence causing suppression
      of an otherwise expected 'member down' event to the user.
      
      We fix this by setting the 'dest_droppable' bit even in group protocol
      messages.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      23483399
  10. 14 12月, 2017 1 次提交
  11. 11 12月, 2017 1 次提交
    • T
      rhashtable: Change rhashtable_walk_start to return void · 97a6ec4a
      Tom Herbert 提交于
      Most callers of rhashtable_walk_start don't care about a resize event
      which is indicated by a return value of -EAGAIN. So calls to
      rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something
      like this is common:
      
             ret = rhashtable_walk_start(rhiter);
             if (ret && ret != -EAGAIN)
                     goto out;
      
      Since zero and -EAGAIN are the only possible return values from the
      function this check is pointless. The condition never evaluates to true.
      
      This patch changes rhashtable_walk_start to return void. This simplifies
      code for the callers that ignore -EAGAIN. For the few cases where the
      caller cares about the resize event, particularly where the table can be
      walked in mulitple parts for netlink or seq file dump, the function
      rhashtable_walk_start_check has been added that returns -EAGAIN on a
      resize event.
      Signed-off-by: NTom Herbert <tom@quantonium.net>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      97a6ec4a
  12. 06 12月, 2017 2 次提交
    • J
      tipc: fix memory leak in tipc_accept_from_sock() · a7d5f107
      Jon Maloy 提交于
      When the function tipc_accept_from_sock() fails to create an instance of
      struct tipc_subscriber it omits to free the already created instance of
      struct tipc_conn instance before it returns.
      
      We fix that with this commit.
      Reported-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7d5f107
    • C
      tipc: fix a null pointer deref on error path · 672ecbe1
      Cong Wang 提交于
      In tipc_topsrv_kern_subscr() when s->tipc_conn_new() fails
      we call tipc_close_conn() to clean up, but in this case
      calling conn_put() is just enough.
      
      This fixes the folllowing crash:
      
       kasan: GPF could be caused by NULL-ptr deref or user memory access
       general protection fault: 0000 [#1] SMP KASAN
       Dumping ftrace buffer:
          (ftrace buffer empty)
       Modules linked in:
       CPU: 0 PID: 3085 Comm: syzkaller064164 Not tainted 4.15.0-rc1+ #137
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       task: 00000000c24413a5 task.stack: 000000005e8160b5
       RIP: 0010:__lock_acquire+0xd55/0x47f0 kernel/locking/lockdep.c:3378
       RSP: 0018:ffff8801cb5474a8 EFLAGS: 00010002
       RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
       RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff85ecb400
       RBP: ffff8801cb547830 R08: 0000000000000001 R09: 0000000000000000
       R10: 0000000000000000 R11: ffffffff87489d60 R12: ffff8801cd2980c0
       R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020
       FS:  00000000014ee880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007ffee2426e40 CR3: 00000001cb85a000 CR4: 00000000001406f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
        __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
        _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:175
        spin_lock_bh include/linux/spinlock.h:320 [inline]
        tipc_subscrb_subscrp_delete+0x8f/0x470 net/tipc/subscr.c:201
        tipc_subscrb_delete net/tipc/subscr.c:238 [inline]
        tipc_subscrb_release_cb+0x17/0x30 net/tipc/subscr.c:316
        tipc_close_conn+0x171/0x270 net/tipc/server.c:204
        tipc_topsrv_kern_subscr+0x724/0x810 net/tipc/server.c:514
        tipc_group_create+0x702/0x9c0 net/tipc/group.c:184
        tipc_sk_join net/tipc/socket.c:2747 [inline]
        tipc_setsockopt+0x249/0xc10 net/tipc/socket.c:2861
        SYSC_setsockopt net/socket.c:1851 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1830
        entry_SYSCALL_64_fastpath+0x1f/0x96
      
      Fixes: 14c04493 ("tipc: add ability to order and receive topology events in driver")
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      672ecbe1
  13. 02 12月, 2017 2 次提交
    • J
      tipc: fall back to smaller MTU if allocation of local send skb fails · 4c94cc2d
      Jon Maloy 提交于
      When sending node local messages the code is using an 'mtu' of 66060
      bytes to avoid unnecessary fragmentation. During situations of low
      memory tipc_msg_build() may sometimes fail to allocate such large
      buffers, resulting in unnecessary send failures. This can easily be
      remedied by falling back to a smaller MTU, and then reassemble the
      buffer chain as if the message were arriving from a remote node.
      
      At the same time, we change the initial MTU setting of the broadcast
      link to a lower value, so that large messages always are fragmented
      into smaller buffers even when we run in single node mode. Apart from
      obtaining the same advantage as for the 'fallback' solution above, this
      turns out to give a significant performance improvement. This can
      probably be explained with the __pskb_copy() operation performed on the
      buffer for each recipient during reception. We found the optimal value
      for this, considering the most relevant skb pool, to be 3744 bytes.
      Acked-by: NYing Xue <ying.xue@ericsson.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c94cc2d
    • T
      tipc: call tipc_rcv() only if bearer is up in tipc_udp_recv() · c7799c06
      Tommi Rantala 提交于
      Remove the second tipc_rcv() call in tipc_udp_recv(). We have just
      checked that the bearer is not up, and calling tipc_rcv() with a bearer
      that is not up leads to a TIPC div-by-zero crash in
      tipc_node_calculate_timer(). The crash is rare in practice, but can
      happen like this:
      
        We're enabling a bearer, but it's not yet up and fully initialized.
        At the same time we receive a discovery packet, and in tipc_udp_recv()
        we end up calling tipc_rcv() with the not-yet-initialized bearer,
        causing later the div-by-zero crash in tipc_node_calculate_timer().
      
      Jon Maloy explains the impact of removing the second tipc_rcv() call:
        "link setup in the worst case will be delayed until the next arriving
         discovery messages, 1 sec later, and this is an acceptable delay."
      
      As the tipc_rcv() call is removed, just leave the function via the
      rcu_out label, so that we will kfree_skb().
      
      [   12.590450] Own node address <1.1.1>, network identity 1
      [   12.668088] divide error: 0000 [#1] SMP
      [   12.676952] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.14.2-dirty #1
      [   12.679225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      [   12.682095] task: ffff8c2a761edb80 task.stack: ffffa41cc0cac000
      [   12.684087] RIP: 0010:tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
      [   12.686486] RSP: 0018:ffff8c2a7fc838a0 EFLAGS: 00010246
      [   12.688451] RAX: 0000000000000000 RBX: ffff8c2a5b382600 RCX: 0000000000000000
      [   12.691197] RDX: 0000000000000000 RSI: ffff8c2a5b382600 RDI: ffff8c2a5b382600
      [   12.693945] RBP: ffff8c2a7fc838b0 R08: 0000000000000001 R09: 0000000000000001
      [   12.696632] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c2a5d8949d8
      [   12.699491] R13: ffffffff95ede400 R14: 0000000000000000 R15: ffff8c2a5d894800
      [   12.702338] FS:  0000000000000000(0000) GS:ffff8c2a7fc80000(0000) knlGS:0000000000000000
      [   12.705099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   12.706776] CR2: 0000000001bb9440 CR3: 00000000bd009001 CR4: 00000000003606e0
      [   12.708847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   12.711016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   12.712627] Call Trace:
      [   12.713390]  <IRQ>
      [   12.714011]  tipc_node_check_dest+0x2e8/0x350 [tipc]
      [   12.715286]  tipc_disc_rcv+0x14d/0x1d0 [tipc]
      [   12.716370]  tipc_rcv+0x8b0/0xd40 [tipc]
      [   12.717396]  ? minmax_running_min+0x2f/0x60
      [   12.718248]  ? dst_alloc+0x4c/0xa0
      [   12.718964]  ? tcp_ack+0xaf1/0x10b0
      [   12.719658]  ? tipc_udp_is_known_peer+0xa0/0xa0 [tipc]
      [   12.720634]  tipc_udp_recv+0x71/0x1d0 [tipc]
      [   12.721459]  ? dst_alloc+0x4c/0xa0
      [   12.722130]  udp_queue_rcv_skb+0x264/0x490
      [   12.722924]  __udp4_lib_rcv+0x21e/0x990
      [   12.723670]  ? ip_route_input_rcu+0x2dd/0xbf0
      [   12.724442]  ? tcp_v4_rcv+0x958/0xa40
      [   12.725039]  udp_rcv+0x1a/0x20
      [   12.725587]  ip_local_deliver_finish+0x97/0x1d0
      [   12.726323]  ip_local_deliver+0xaf/0xc0
      [   12.726959]  ? ip_route_input_noref+0x19/0x20
      [   12.727689]  ip_rcv_finish+0xdd/0x3b0
      [   12.728307]  ip_rcv+0x2ac/0x360
      [   12.728839]  __netif_receive_skb_core+0x6fb/0xa90
      [   12.729580]  ? udp4_gro_receive+0x1a7/0x2c0
      [   12.730274]  __netif_receive_skb+0x1d/0x60
      [   12.730953]  ? __netif_receive_skb+0x1d/0x60
      [   12.731637]  netif_receive_skb_internal+0x37/0xd0
      [   12.732371]  napi_gro_receive+0xc7/0xf0
      [   12.732920]  receive_buf+0x3c3/0xd40
      [   12.733441]  virtnet_poll+0xb1/0x250
      [   12.733944]  net_rx_action+0x23e/0x370
      [   12.734476]  __do_softirq+0xc5/0x2f8
      [   12.734922]  irq_exit+0xfa/0x100
      [   12.735315]  do_IRQ+0x4f/0xd0
      [   12.735680]  common_interrupt+0xa2/0xa2
      [   12.736126]  </IRQ>
      [   12.736416] RIP: 0010:native_safe_halt+0x6/0x10
      [   12.736925] RSP: 0018:ffffa41cc0cafe90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff4d
      [   12.737756] RAX: 0000000000000000 RBX: ffff8c2a761edb80 RCX: 0000000000000000
      [   12.738504] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      [   12.739258] RBP: ffffa41cc0cafe90 R08: 0000014b5b9795e5 R09: ffffa41cc12c7e88
      [   12.740118] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
      [   12.740964] R13: ffff8c2a761edb80 R14: 0000000000000000 R15: 0000000000000000
      [   12.741831]  default_idle+0x2a/0x100
      [   12.742323]  arch_cpu_idle+0xf/0x20
      [   12.742796]  default_idle_call+0x28/0x40
      [   12.743312]  do_idle+0x179/0x1f0
      [   12.743761]  cpu_startup_entry+0x1d/0x20
      [   12.744291]  start_secondary+0x112/0x120
      [   12.744816]  secondary_startup_64+0xa5/0xa5
      [   12.745367] Code: b9 f4 01 00 00 48 89 c2 48 c1 ea 02 48 3d d3 07 00
      00 48 0f 47 d1 49 8b 0c 24 48 39 d1 76 07 49 89 14 24 48 89 d1 31 d2 48
      89 df <48> f7 f1 89 c6 e8 81 6e ff ff 5b 41 5c 5d c3 66 90 66 2e 0f 1f
      [   12.747527] RIP: tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc] RSP: ffff8c2a7fc838a0
      [   12.748555] ---[ end trace 1399ab83390650fd ]---
      [   12.749296] Kernel panic - not syncing: Fatal exception in interrupt
      [   12.750123] Kernel Offset: 0x13200000 from 0xffffffff82000000
      (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [   12.751215] Rebooting in 60 seconds..
      
      Fixes: c9b64d49 ("tipc: add replicast peer discovery")
      Signed-off-by: NTommi Rantala <tommi.t.rantala@nokia.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7799c06
  14. 28 11月, 2017 1 次提交
  15. 21 11月, 2017 1 次提交
    • J
      tipc: fix access of released memory · e0e853ac
      Jon Maloy 提交于
      When the function tipc_group_filter_msg() finds that a member event
      indicates that the member is leaving the group, it first deletes the
      member instance, and then purges the message queue being handled
      by the call. But the message queue is an aggregated field in the
      just deleted item, leading the purge call to access freed memory.
      
      We fix this by swapping the order of the two actions.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e0e853ac
  16. 16 11月, 2017 1 次提交
    • J
      tipc: enforce valid ratio between skb truesize and contents · d618d09a
      Jon Maloy 提交于
      The socket level flow control is based on the assumption that incoming
      buffers meet the condition (skb->truesize / roundup(skb->len) <= 4),
      where the latter value is rounded off upwards to the nearest 1k number.
      This does empirically hold true for the device drivers we know, but we
      cannot trust that it will always be so, e.g., in a system with jumbo
      frames and very small packets.
      
      We now introduce a check for this condition at packet arrival, and if
      we find it to be false, we copy the packet to a new, smaller buffer,
      where the condition will be true. We expect this to affect only a small
      fraction of all incoming packets, if at all.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d618d09a
  17. 11 11月, 2017 1 次提交
    • J
      tipc: improve link resiliency when rps is activated · 8d6e79d3
      Jon Maloy 提交于
      Currently, the TIPC RPS dissector is based only on the incoming packets'
      source node address, hence steering all traffic from a node to the same
      core. We have seen that this makes the links vulnerable to starvation
      and unnecessary resets when we turn down the link tolerance to very low
      values.
      
      To reduce the risk of this happening, we exempt probe and probe replies
      packets from the convergence to one core per source node. Instead, we do
      the opposite, - we try to diverge those packets across as many cores as
      possible, by randomizing the flow selector key.
      
      To make such packets identifiable to the dissector, we add a new
      'is_keepalive' bit to word 0 of the LINK_PROTOCOL header. This bit is
      set both for PROBE and PROBE_REPLY messages, and only for those.
      
      It should be noted that these packets are not part of any flow anyway,
      and only constitute a minuscule fraction of all packets sent across a
      link. Hence, there is no risk that this will affect overall performance.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d6e79d3
  18. 03 11月, 2017 1 次提交
    • J
      tipc: eliminate unnecessary probing · fa368826
      Jon Maloy 提交于
      The neighbor monitor employs a threshold, default set to 32 peer nodes,
      where it activates the "Overlapping Neighbor Monitoring" algorithm.
      Below that threshold, monitoring is full-mesh, and no "domain records"
      are passed between the nodes.
      
      Because of this, a node never received a peer's ack that it has received
      the most recent update of the own domain. Hence, the field 'acked_gen'
      in struct tipc_monitor_state remains permamently at zero, whereas the
      own domain generation is incremented for each added or removed peer.
      
      This has the effect that the function tipc_mon_get_state() always sets
      the field 'probing' in struct tipc_monitor_state true, again leading the
      tipc_link_timeout() of the link in question to always send out a probe,
      even when link->silent_intv_count is zero.
      
      This is functionally harmless, but leads to some unncessary probing,
      which can easily be eliminated by setting the 'probing' field of the
      said struct correctly in such cases.
      
      At the same time, we explictly invalidate the sent domain records when
      the algorithm is not activated. This will eliminate any risk that an
      invalid domain record might be inadverently accepted by the peer.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fa368826
  19. 02 11月, 2017 1 次提交
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  20. 01 11月, 2017 1 次提交
  21. 26 10月, 2017 2 次提交
  22. 22 10月, 2017 1 次提交
  23. 21 10月, 2017 1 次提交
  24. 20 10月, 2017 1 次提交
  25. 17 10月, 2017 1 次提交
  26. 13 10月, 2017 4 次提交
    • J
      tipc: add multipoint-to-point flow control · 04d7b574
      Jon Maloy 提交于
      We already have point-to-multipoint flow control within a group. But
      we even need the opposite; -a scheme which can handle that potentially
      hundreds of sources may try to send messages to the same destination
      simultaneously without causing buffer overflow at the recipient. This
      commit adds such a mechanism.
      
      The algorithm works as follows:
      
      - When a member detects a new, joining member, it initially set its
        state to JOINED and advertises a minimum window to the new member.
        This window is chosen so that the new member can send exactly one
        maximum sized message, or several smaller ones, to the recipient
        before it must stop and wait for an additional advertisement. This
        minimum window ADV_IDLE is set to 65 1kB blocks.
      
      - When a member receives the first data message from a JOINED member,
        it changes the state of the latter to ACTIVE, and advertises a larger
        window ADV_ACTIVE = 12 x ADV_IDLE blocks to the sender, so it can
        continue sending with minimal disturbances to the data flow.
      
      - The active members are kept in a dedicated linked list. Each time a
        message is received from an active member, it will be moved to the
        tail of that list. This way, we keep a record of which members have
        been most (tail) and least (head) recently active.
      
      - There is a maximum number (16) of permitted simultaneous active
        senders per receiver. When this limit is reached, the receiver will
        not advertise anything immediately to a new sender, but instead put
        it in a PENDING state, and add it to a corresponding queue. At the
        same time, it will pick the least recently active member, send it an
        advertisement RECLAIM message, and set this member to state
        RECLAIMING.
      
      - The reclaimee member has to respond with a REMIT message, meaning that
        it goes back to a send window of ADV_IDLE, and returns its unused
        advertised blocks beyond that value to the reclaiming member.
      
      - When the reclaiming member receives the REMIT message, it unlinks
        the reclaimee from its active list, resets its state to JOINED, and
        notes that it is now back at ADV_IDLE advertised blocks to that
        member. If there are still unread data messages sent out by
        reclaimee before the REMIT, the member goes into an intermediate
        state REMITTED, where it stays until the said messages have been
        consumed.
      
      - The returned advertised blocks can now be re-advertised to the
        pending member, which is now set to state ACTIVE and added to
        the active member list.
      
      - To be proactive, i.e., to minimize the risk that any member will
        end up in the pending queue, we start reclaiming resources already
        when the number of active members exceeds 3/4 of the permitted
        maximum.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04d7b574
    • J
      tipc: guarantee delivery of last broadcast before DOWN event · a3bada70
      Jon Maloy 提交于
      The following scenario is possible:
      - A user sends a broadcast message, and thereafter immediately leaves
        the group.
      - The LEAVE message, following a different path than the broadcast,
        arrives ahead of the broadcast, and the sending member is removed
        from the receiver's list.
      - The broadcast message arrives, but is dropped because the sender
        now is unknown to the receipient.
      
      We fix this by sequence numbering membership events, just like ordinary
      unicast messages. Currently, when a JOIN is sent to a peer, it contains
      a synchronization point, - the sequence number of the next sent
      broadcast, in order to give the receiver a start synchronization point.
      We now let even LEAVE messages contain such an "end synchronization"
      point, so that the recipient can delay the removal of the sending member
      until it knows that all messages have been received.
      
      The received synchronization points are added as sequence numbers to the
      generated membership events, making it possible to handle them almost
      the same way as regular unicasts in the receiving filter function. In
      particular, a DOWN event with a too high sequence number will be kept
      in the reordering queue until the missing broadcast(s) arrive and have
      been delivered.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a3bada70
    • J
      tipc: guarantee delivery of UP event before first broadcast · 399574d4
      Jon Maloy 提交于
      The following scenario is possible:
      - A user joins a group, and immediately sends out a broadcast message
        to its members.
      - The broadcast message, following a different data path than the
        initial JOIN message sent out during the joining procedure, arrives
        to a receiver before the latter..
      - The receiver drops the message, since it is not ready to accept any
        messages until the JOIN has arrived.
      
      We avoid this by treating group protocol JOIN messages like unicast
      messages.
      - We let them pass through the recipient's multicast input queue, just
        like ordinary unicasts.
      - We force the first following broadacst to be sent as replicated
        unicast and being acknowledged by the recipient before accepting
        any more broadcast transmissions.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      399574d4
    • J
      tipc: guarantee that group broadcast doesn't bypass group unicast · 2f487712
      Jon Maloy 提交于
      We need a mechanism guaranteeing that group unicasts sent out from a
      socket are not bypassed by later sent broadcasts from the same socket.
      We do this as follows:
      
      - Each time a unicast is sent, we set a the broadcast method for the
        socket to "replicast" and "mandatory". This forces the first
        subsequent broadcast message to follow the same network and data path
        as the preceding unicast to a destination, hence preventing it from
        overtaking the latter.
      
      - In order to make the 'same data path' statement above true, we let
        group unicasts pass through the multicast link input queue, instead
        of as previously through the unicast link input queue.
      
      - In the first broadcast following a unicast, we set a new header flag,
        requiring all recipients to immediately acknowledge its reception.
      
      - During the period before all the expected acknowledges are received,
        the socket refuses to accept any more broadcast attempts, i.e., by
        blocking or returning EAGAIN. This period should typically not be
        longer than a few microseconds.
      
      - When all acknowledges have been received, the sending socket will
        open up for subsequent broadcasts, this time giving the link layer
        freedom to itself select the best transmission method.
      
      - The forced and/or abrupt transmission method changes described above
        may lead to broadcasts arriving out of order to the recipients. We
        remedy this by introducing code that checks and if necessary
        re-orders such messages at the receiving end.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2f487712