1. 28 7月, 2021 1 次提交
  2. 30 6月, 2021 1 次提交
  3. 29 6月, 2021 3 次提交
  4. 23 6月, 2021 3 次提交
    • X
      sctp: process sctp over udp icmp err on sctp side · 9e47df00
      Xin Long 提交于
      Previously, sctp over udp was using udp tunnel's icmp err process, which
      only does sk lookup on sctp side. However for sctp's icmp error process,
      there are more things to do, like syncing assoc pmtu/retransmit packets
      for toobig type err, and starting proto_unreach_timer for unreach type
      err etc.
      
      Now after adding PLPMTUD, which also requires to process toobig type err
      on sctp side. This patch is to process icmp err on sctp side by parsing
      the type/code/info in .encap_err_lookup and call sctp's icmp processing
      functions. Note as the 'redirect' err process needs to know the outer
      ip(v6) header's, we have to leave it to udp(v6)_err to handle it.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9e47df00
    • X
      sctp: extract sctp_v4_err_handle function from sctp_v4_err · d8306075
      Xin Long 提交于
      This patch is to extract sctp_v4_err_handle() from sctp_v4_err() to
      only handle the icmp err after the sock lookup, and it also makes
      the code clearer.
      
      sctp_v4_err_handle() will be used in sctp over udp's err handling
      in the following patch.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d8306075
    • X
      sctp: do state transition when receiving an icmp TOOBIG packet · 83696408
      Xin Long 提交于
      PLPMTUD will short-circuit the old process for icmp TOOBIG packets.
      This part is described in rfc8899#section-4.6.2 (PL_PTB_SIZE =
      PTB_SIZE - other_headers_len). Note that from rfc8899#section-5.2
      State Machine, each case below is for some specific states only:
      
        a) PL_PTB_SIZE < MIN_PLPMTU || PL_PTB_SIZE >= PROBED_SIZE,
           discard it, for any state
      
        b) MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU,
           Base -> Error, for Base state
      
        c) BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU,
           Search -> Base or Complete -> Base, for Search and Complete states.
      
        d) PLPMTU < PL_PTB_SIZE < PROBED_SIZE,
           set pl.probe_size to PL_PTB_SIZE then verify it, for Search state.
      
      The most important one is case d), which will help find the optimal
      fast during searching. Like when pathmtu = 1392 for SCTP over IPv4,
      the search will be (20 is iphdr_len):
      
        1. probe with 1200 - 20
        2. probe with 1232 - 20
        3. probe with 1264 - 20
        ...
        7. probe with 1388 - 20
        8. probe with 1420 - 20
      
      When sending the probe with 1420 - 20, TOOBIG may come with PL_PTB_SIZE =
      1392 - 20. Then it matches case d), and saves some rounds to try with the
      1392 - 20 probe. But of course, PLPMTUD doesn't trust TOOBIG packets, and
      it will go back to the common searching once the probe with the new size
      can't be verified.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      83696408
  5. 18 5月, 2021 1 次提交
  6. 15 11月, 2020 1 次提交
    • X
      sctp: change to hold/put transport for proto_unreach_timer · 057a10fa
      Xin Long 提交于
      A call trace was found in Hangbin's Codenomicon testing with debug kernel:
      
        [ 2615.981988] ODEBUG: free active (active state 0) object type: timer_list hint: sctp_generate_proto_unreach_event+0x0/0x3a0 [sctp]
        [ 2615.995050] WARNING: CPU: 17 PID: 0 at lib/debugobjects.c:328 debug_print_object+0x199/0x2b0
        [ 2616.095934] RIP: 0010:debug_print_object+0x199/0x2b0
        [ 2616.191533] Call Trace:
        [ 2616.194265]  <IRQ>
        [ 2616.202068]  debug_check_no_obj_freed+0x25e/0x3f0
        [ 2616.207336]  slab_free_freelist_hook+0xeb/0x140
        [ 2616.220971]  kfree+0xd6/0x2c0
        [ 2616.224293]  rcu_do_batch+0x3bd/0xc70
        [ 2616.243096]  rcu_core+0x8b9/0xd00
        [ 2616.256065]  __do_softirq+0x23d/0xacd
        [ 2616.260166]  irq_exit+0x236/0x2a0
        [ 2616.263879]  smp_apic_timer_interrupt+0x18d/0x620
        [ 2616.269138]  apic_timer_interrupt+0xf/0x20
        [ 2616.273711]  </IRQ>
      
      This is because it holds asoc when transport->proto_unreach_timer starts
      and puts asoc when the timer stops, and without holding transport the
      transport could be freed when the timer is still running.
      
      So fix it by holding/putting transport instead for proto_unreach_timer
      in transport, just like other timers in transport.
      
      v1->v2:
        - Also use sctp_transport_put() for the "out_unlock:" path in
          sctp_generate_proto_unreach_event(), as Marcelo noticed.
      
      Fixes: 50b5d6ad ("sctp: Fix a race between ICMP protocol unreachable and connect()")
      Reported-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Link: https://lore.kernel.org/r/102788809b554958b13b95d33440f5448113b8d6.1605331373.git.lucien.xin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      057a10fa
  7. 25 2月, 2020 1 次提交
  8. 10 12月, 2019 1 次提交
  9. 24 11月, 2019 1 次提交
    • X
      sctp: cache netns in sctp_ep_common · 31243461
      Xin Long 提交于
      This patch is to fix a data-race reported by syzbot:
      
        BUG: KCSAN: data-race in sctp_assoc_migrate / sctp_hash_obj
      
        write to 0xffff8880b67c0020 of 8 bytes by task 18908 on cpu 1:
          sctp_assoc_migrate+0x1a6/0x290 net/sctp/associola.c:1091
          sctp_sock_migrate+0x8aa/0x9b0 net/sctp/socket.c:9465
          sctp_accept+0x3c8/0x470 net/sctp/socket.c:4916
          inet_accept+0x7f/0x360 net/ipv4/af_inet.c:734
          __sys_accept4+0x224/0x430 net/socket.c:1754
          __do_sys_accept net/socket.c:1795 [inline]
          __se_sys_accept net/socket.c:1792 [inline]
          __x64_sys_accept+0x4e/0x60 net/socket.c:1792
          do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        read to 0xffff8880b67c0020 of 8 bytes by task 12003 on cpu 0:
          sctp_hash_obj+0x4f/0x2d0 net/sctp/input.c:894
          rht_key_get_hash include/linux/rhashtable.h:133 [inline]
          rht_key_hashfn include/linux/rhashtable.h:159 [inline]
          rht_head_hashfn include/linux/rhashtable.h:174 [inline]
          head_hashfn lib/rhashtable.c:41 [inline]
          rhashtable_rehash_one lib/rhashtable.c:245 [inline]
          rhashtable_rehash_chain lib/rhashtable.c:276 [inline]
          rhashtable_rehash_table lib/rhashtable.c:316 [inline]
          rht_deferred_worker+0x468/0xab0 lib/rhashtable.c:420
          process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
          worker_thread+0xa0/0x800 kernel/workqueue.c:2415
          kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
          ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      It was caused by rhashtable access asoc->base.sk when sctp_assoc_migrate
      is changing its value. However, what rhashtable wants is netns from asoc
      base.sk, and for an asoc, its netns won't change once set. So we can
      simply fix it by caching netns since created.
      
      Fixes: d6c0256a ("sctp: add the rhashtable apis for sctp global transport hashtable")
      Reported-by: syzbot+e3b35fe7918ff0ee474e@syzkaller.appspotmail.com
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      31243461
  10. 10 10月, 2019 2 次提交
    • E
      net: silence KCSAN warnings around sk_add_backlog() calls · 8265792b
      Eric Dumazet 提交于
      sk_add_backlog() callers usually read sk->sk_rcvbuf without
      owning the socket lock. This means sk_rcvbuf value can
      be changed by other cpus, and KCSAN complains.
      
      Add READ_ONCE() annotations to document the lockless nature
      of these reads.
      
      Note that writes over sk_rcvbuf should also use WRITE_ONCE(),
      but this will be done in separate patches to ease stable
      backports (if we decide this is relevant for stable trees).
      
      BUG: KCSAN: data-race in tcp_add_backlog / tcp_recvmsg
      
      write to 0xffff88812ab369f8 of 8 bytes by interrupt on cpu 1:
       __sk_add_backlog include/net/sock.h:902 [inline]
       sk_add_backlog include/net/sock.h:933 [inline]
       tcp_add_backlog+0x45a/0xcc0 net/ipv4/tcp_ipv4.c:1737
       tcp_v4_rcv+0x1aba/0x1bf0 net/ipv4/tcp_ipv4.c:1925
       ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118
       netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208
       napi_skb_finish net/core/dev.c:5671 [inline]
       napi_gro_receive+0x28f/0x330 net/core/dev.c:5704
       receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
       virtnet_receive drivers/net/virtio_net.c:1323 [inline]
       virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428
       napi_poll net/core/dev.c:6352 [inline]
       net_rx_action+0x3ae/0xa50 net/core/dev.c:6418
      
      read to 0xffff88812ab369f8 of 8 bytes by task 7271 on cpu 0:
       tcp_recvmsg+0x470/0x1a30 net/ipv4/tcp.c:2047
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec net/socket.c:871 [inline]
       sock_recvmsg net/socket.c:889 [inline]
       sock_recvmsg+0x92/0xb0 net/socket.c:885
       sock_read_iter+0x15f/0x1e0 net/socket.c:967
       call_read_iter include/linux/fs.h:1864 [inline]
       new_sync_read+0x389/0x4f0 fs/read_write.c:414
       __vfs_read+0xb1/0xc0 fs/read_write.c:427
       vfs_read fs/read_write.c:461 [inline]
       vfs_read+0x143/0x2c0 fs/read_write.c:446
       ksys_read+0xd5/0x1b0 fs/read_write.c:587
       __do_sys_read fs/read_write.c:597 [inline]
       __se_sys_read fs/read_write.c:595 [inline]
       __x64_sys_read+0x4c/0x60 fs/read_write.c:595
       do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 7271 Comm: syz-fuzzer Not tainted 5.3.0+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      8265792b
    • X
      sctp: add chunks to sk_backlog when the newsk sk_socket is not set · 819be810
      Xin Long 提交于
      This patch is to fix a NULL-ptr deref in selinux_socket_connect_helper:
      
        [...] kasan: GPF could be caused by NULL-ptr deref or user memory access
        [...] RIP: 0010:selinux_socket_connect_helper+0x94/0x460
        [...] Call Trace:
        [...]  selinux_sctp_bind_connect+0x16a/0x1d0
        [...]  security_sctp_bind_connect+0x58/0x90
        [...]  sctp_process_asconf+0xa52/0xfd0 [sctp]
        [...]  sctp_sf_do_asconf+0x785/0x980 [sctp]
        [...]  sctp_do_sm+0x175/0x5a0 [sctp]
        [...]  sctp_assoc_bh_rcv+0x285/0x5b0 [sctp]
        [...]  sctp_backlog_rcv+0x482/0x910 [sctp]
        [...]  __release_sock+0x11e/0x310
        [...]  release_sock+0x4f/0x180
        [...]  sctp_accept+0x3f9/0x5a0 [sctp]
        [...]  inet_accept+0xe7/0x720
      
      It was caused by that the 'newsk' sk_socket was not set before going to
      security sctp hook when processing asconf chunk with SCTP_PARAM_ADD_IP
      or SCTP_PARAM_SET_PRIMARY:
      
        inet_accept()->
          sctp_accept():
            lock_sock():
                lock listening 'sk'
                                                do_softirq():
                                                  sctp_rcv():  <-- [1]
                                                      asconf chunk arrives and
                                                      enqueued in 'sk' backlog
            sctp_sock_migrate():
                set asoc's sk to 'newsk'
            release_sock():
                sctp_backlog_rcv():
                  lock 'newsk'
                  sctp_process_asconf()  <-- [2]
                  unlock 'newsk'
          sock_graft():
              set sk_socket  <-- [3]
      
      As it shows, at [1] the asconf chunk would be put into the listening 'sk'
      backlog, as accept() was holding its sock lock. Then at [2] asconf would
      get processed with 'newsk' as asoc's sk had been set to 'newsk'. However,
      'newsk' sk_socket is not set until [3], while selinux_sctp_bind_connect()
      would deref it, then kernel crashed.
      
      Here to fix it by adding the chunk to sk_backlog until newsk sk_socket is
      set when .accept() is done.
      
      Note that sk->sk_socket can be NULL when the sock is closed, so SOCK_DEAD
      flag is also needed to check in sctp_newsk_ready().
      
      Thanks to Ondrej for reviewing the code.
      
      Fixes: d452930f ("selinux: Add SCTP support")
      Reported-by: NYing Xu <yinxu@redhat.com>
      Suggested-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      819be810
  11. 02 10月, 2019 1 次提交
    • F
      netfilter: drop bridge nf reset from nf_reset · 895b5c9f
      Florian Westphal 提交于
      commit 174e2381
      ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi
      recycle always drop skb extensions.  The additional skb_ext_del() that is
      performed via nf_reset on napi skb recycle is not needed anymore.
      
      Most nf_reset() calls in the stack are there so queued skb won't block
      'rmmod nf_conntrack' indefinitely.
      
      This removes the skb_ext_del from nf_reset, and renames it to a more
      fitting nf_reset_ct().
      
      In a few selected places, add a call to skb_ext_reset to make sure that
      no active extensions remain.
      
      I am submitting this for "net", because we're still early in the release
      cycle.  The patch applies to net-next too, but I think the rename causes
      needless divergence between those trees.
      Suggested-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      895b5c9f
  12. 24 5月, 2019 1 次提交
  13. 13 11月, 2018 2 次提交
  14. 09 11月, 2018 1 次提交
    • S
      net: Convert protocol error handlers from void to int · 32bbd879
      Stefano Brivio 提交于
      We'll need this to handle ICMP errors for tunnels without a sending socket
      (i.e. FoU and GUE). There, we might have to look up different types of IP
      tunnels, registered as network protocols, before we get a match, so we
      want this for the error handlers of IPPROTO_IPIP and IPPROTO_IPV6 in both
      inet_protos and inet6_protos. These error codes will be used in the next
      patch.
      
      For consistency, return sensible error codes in protocol error handlers
      whenever handlers can't handle errors because, even if valid, they don't
      match a protocol or any of its states.
      
      This has no effect on existing error handling paths.
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32bbd879
  15. 16 10月, 2018 1 次提交
    • X
      sctp: use the pmtu from the icmp packet to update transport pathmtu · d805397c
      Xin Long 提交于
      Other than asoc pmtu sync from all transports, sctp_assoc_sync_pmtu
      is also processing transport pmtu_pending by icmp packets. But it's
      meaningless to use sctp_dst_mtu(t->dst) as new pmtu for a transport.
      
      The right pmtu value should come from the icmp packet, and it would
      be saved into transport->mtu_info in this patch and used later when
      the pmtu sync happens in sctp_sendmsg_to_asoc or sctp_packet_config.
      
      Besides, without this patch, as pmtu can only be updated correctly
      when receiving a icmp packet and no place is holding sock lock, it
      will take long time if the sock is busy with sending packets.
      
      Note that it doesn't process transport->mtu_info in .release_cb(),
      as there is no enough information for pmtu update, like for which
      asoc or transport. It is not worth traversing all asocs to check
      pmtu_pending. So unlike tcp, sctp does this in tx path, for which
      mtu_info needs to be atomic_t.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d805397c
  16. 22 6月, 2018 1 次提交
    • N
      rhashtable: split rhashtable.h · 0eb71a9d
      NeilBrown 提交于
      Due to the use of rhashtables in net namespaces,
      rhashtable.h is included in lots of the kernel,
      so a small changes can required a large recompilation.
      This makes development painful.
      
      This patch splits out rhashtable-types.h which just includes
      the major type declarations, and does not include (non-trivial)
      inline code.  rhashtable.h is no longer included by anything
      in the include/ directory.
      Common include files only include rhashtable-types.h so a large
      recompilation is only triggered when that changes.
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0eb71a9d
  17. 27 3月, 2018 1 次提交
  18. 10 3月, 2018 1 次提交
  19. 13 2月, 2018 1 次提交
  20. 09 1月, 2018 2 次提交
    • M
      sctp: fix the handling of ICMP Frag Needed for too small MTUs · b6c5734d
      Marcelo Ricardo Leitner 提交于
      syzbot reported a hang involving SCTP, on which it kept flooding dmesg
      with the message:
      [  246.742374] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
      low, using default minimum of 512
      
      That happened because whenever SCTP hits an ICMP Frag Needed, it tries
      to adjust to the new MTU and triggers an immediate retransmission. But
      it didn't consider the fact that MTUs smaller than the SCTP minimum MTU
      allowed (512) would not cause the PMTU to change, and issued the
      retransmission anyway (thus leading to another ICMP Frag Needed, and so
      on).
      
      As IPv4 (ip_rt_min_pmtu=556) and IPv6 (IPV6_MIN_MTU=1280) minimum MTU
      are higher than that, sctp_transport_update_pmtu() is changed to
      re-fetch the PMTU that got set after our request, and with that, detect
      if there was an actual change or not.
      
      The fix, thus, skips the immediate retransmission if the received ICMP
      resulted in no change, in the hope that SCTP will select another path.
      
      Note: The value being used for the minimum MTU (512,
      SCTP_DEFAULT_MINSEGMENT) is not right and instead it should be (576,
      SCTP_MIN_PMTU), but such change belongs to another patch.
      
      Changes from v1:
      - do not disable PMTU discovery, in the light of commit
      06ad3919 ("[SCTP] Don't disable PMTU discovery when mtu is small")
      and as suggested by Xin Long.
      - changed the way to break the rtx loop by detecting if the icmp
        resulted in a change or not
      Changes from v2:
      none
      
      See-also: https://lkml.org/lkml/2017/12/22/811Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b6c5734d
    • M
      sctp: do not retransmit upon FragNeeded if PMTU discovery is disabled · cc35c3d1
      Marcelo Ricardo Leitner 提交于
      Currently, if PMTU discovery is disabled on a given transport, but the
      configured value is higher than the actual PMTU, it is likely that we
      will get some icmp Frag Needed. The issue is, if PMTU discovery is
      disabled, we won't update the information and will issue a
      retransmission immediately, which may very well trigger another ICMP,
      and another retransmission, leading to a loop.
      
      The fix is to simply not trigger immediate retransmissions if PMTU
      discovery is disabled on the given transport.
      
      Changes from v2:
      - updated stale comment, noticed by Xin Long
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cc35c3d1
  21. 29 10月, 2017 1 次提交
  22. 20 10月, 2017 1 次提交
  23. 04 8月, 2017 1 次提交
  24. 02 7月, 2017 2 次提交
  25. 27 5月, 2017 1 次提交
  26. 05 4月, 2017 1 次提交
  27. 02 3月, 2017 1 次提交
  28. 20 2月, 2017 1 次提交
    • X
      sctp: check duplicate node before inserting a new transport · cd2b7087
      Xin Long 提交于
      sctp has changed to use rhlist for transport rhashtable since commit
      7fda702f ("sctp: use new rhlist interface on sctp transport
      rhashtable").
      
      But rhltable_insert_key doesn't check the duplicate node when inserting
      a node, unlike rhashtable_lookup_insert_key. It may cause duplicate
      assoc/transport in rhashtable. like:
      
       client (addr A, B)                 server (addr X, Y)
          connect to X           INIT (1)
                              ------------>
          connect to Y           INIT (2)
                              ------------>
                               INIT_ACK (1)
                              <------------
                               INIT_ACK (2)
                              <------------
      
      After sending INIT (2), one transport will be created and hashed into
      rhashtable. But when receiving INIT_ACK (1) and processing the address
      params, another transport will be created and hashed into rhashtable
      with the same addr Y and EP as the last transport. This will confuse
      the assoc/transport's lookup.
      
      This patch is to fix it by returning err if any duplicate node exists
      before inserting it.
      
      Fixes: 7fda702f ("sctp: use new rhlist interface on sctp transport rhashtable")
      Reported-by: NFabio M. Di Nitto <fdinitto@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cd2b7087
  29. 29 12月, 2016 1 次提交
  30. 17 11月, 2016 1 次提交
    • X
      sctp: use new rhlist interface on sctp transport rhashtable · 7fda702f
      Xin Long 提交于
      Now sctp transport rhashtable uses hash(lport, dport, daddr) as the key
      to hash a node to one chain. If in one host thousands of assocs connect
      to one server with the same lport and different laddrs (although it's
      not a normal case), all the transports would be hashed into the same
      chain.
      
      It may cause to keep returning -EBUSY when inserting a new node, as the
      chain is too long and sctp inserts a transport node in a loop, which
      could even lead to system hangs there.
      
      The new rhlist interface works for this case that there are many nodes
      with the same key in one chain. It puts them into a list then makes this
      list be as a node of the chain.
      
      This patch is to replace rhashtable_ interface with rhltable_ interface.
      Since a chain would not be too long and it would not return -EBUSY with
      this fix when inserting a node, the reinsert loop is also removed here.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7fda702f
  31. 01 11月, 2016 2 次提交
    • X
      sctp: hold transport instead of assoc when lookup assoc in rx path · dae399d7
      Xin Long 提交于
      Prior to this patch, in rx path, before calling lock_sock, it needed to
      hold assoc when got it by __sctp_lookup_association, in case other place
      would free/put assoc.
      
      But in __sctp_lookup_association, it lookup and hold transport, then got
      assoc by transport->assoc, then hold assoc and put transport. It means
      it didn't hold transport, yet it was returned and later on directly
      assigned to chunk->transport.
      
      Without the protection of sock lock, the transport may be freed/put by
      other places, which would cause a use-after-free issue.
      
      This patch is to fix this issue by holding transport instead of assoc.
      As holding transport can make sure to access assoc is also safe, and
      actually it looks up assoc by searching transport rhashtable, to hold
      transport here makes more sense.
      
      Note that the function will be renamed later on on another patch.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dae399d7
    • X
      sctp: return back transport in __sctp_rcv_init_lookup · 7c17fcc7
      Xin Long 提交于
      Prior to this patch, it used a local variable to save the transport that is
      looked up by __sctp_lookup_association(), and didn't return it back. But in
      sctp_rcv, it is used to initialize chunk->transport. So when hitting this,
      even if it found the transport, it was still initializing chunk->transport
      with null instead.
      
      This patch is to return the transport back through transport pointer
      that is from __sctp_rcv_lookup_harder().
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c17fcc7