1. 16 11月, 2019 3 次提交
  2. 15 11月, 2019 15 次提交
  3. 13 11月, 2019 10 次提交
  4. 12 11月, 2019 5 次提交
  5. 09 11月, 2019 7 次提交
    • X
      sctp: add SCTP_PEER_ADDR_THLDS_V2 sockopt · d467ac0a
      Xin Long 提交于
      Section 7.2 of rfc7829: "Peer Address Thresholds (SCTP_PEER_ADDR_THLDS)
      Socket Option" extends 'struct sctp_paddrthlds' with 'spt_pathcpthld'
      added to allow a user to change ps_retrans per sock/asoc/transport, as
      other 2 paddrthlds: pf_retrans, pathmaxrxt.
      
      Note: to not break the user's program, here to support pf_retrans dump
      and setting by adding a new sockopt SCTP_PEER_ADDR_THLDS_V2, and a new
      structure sctp_paddrthlds_v2 instead of extending sctp_paddrthlds.
      
      Also, when setting ps_retrans, the value is not allowed to be greater
      than pf_retrans.
      
      v1->v2:
        - use SCTP_PEER_ADDR_THLDS_V2 to set/get pf_retrans instead,
          as Marcelo and David Laight suggested.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d467ac0a
    • X
      sctp: add support for Primary Path Switchover · 34515e94
      Xin Long 提交于
      This is a new feature defined in section 5 of rfc7829: "Primary Path
      Switchover". By introducing a new tunable parameter:
      
        Primary.Switchover.Max.Retrans (PSMR)
      
      The primary path will be changed to another active path when the path
      error counter on the old primary path exceeds PSMR, so that "the SCTP
      sender is allowed to continue data transmission on a new working path
      even when the old primary destination address becomes active again".
      
      This patch is to add this tunable parameter, 'ps_retrans' per netns,
      sock, asoc and transport. It also allows a user to change ps_retrans
      per netns by sysctl, and ps_retrans per sock/asoc/transport will be
      initialized with it.
      
      The check will be done in sctp_do_8_2_transport_strike() when this
      feature is enabled.
      
      Note this feature is disabled by initializing 'ps_retrans' per netns
      as 0xffff by default, and its value can't be less than 'pf_retrans'
      when changing by sysctl.
      
      v3->v4:
        - add define SCTP_PS_RETRANS_MAX 0xffff, and use it on extra2 of
          sysctl 'ps_retrans'.
        - add a new entry for ps_retrans on ip-sysctl.txt.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      34515e94
    • X
      sctp: add SCTP_EXPOSE_POTENTIALLY_FAILED_STATE sockopt · 8d2a6935
      Xin Long 提交于
      This is a sockopt defined in section 7.3 of rfc7829: "Exposing
      the Potentially Failed Path State", by which users can change
      pf_expose per sock and asoc.
      
      The new sockopt SCTP_EXPOSE_POTENTIALLY_FAILED_STATE is also
      known as SCTP_EXPOSE_PF_STATE for short.
      
      v2->v3:
        - return -EINVAL if params.assoc_value > SCTP_PF_EXPOSE_MAX.
        - define SCTP_EXPOSE_PF_STATE SCTP_EXPOSE_POTENTIALLY_FAILED_STATE.
      v3->v4:
        - improve changelog.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d2a6935
    • X
      sctp: add SCTP_ADDR_POTENTIALLY_FAILED notification · 768e1518
      Xin Long 提交于
      SCTP Quick failover draft section 5.1, point 5 has been removed
      from rfc7829. Instead, "the sender SHOULD (i) notify the Upper
      Layer Protocol (ULP) about this state transition", as said in
      section 3.2, point 8.
      
      So this patch is to add SCTP_ADDR_POTENTIALLY_FAILED, defined
      in section 7.1, "which is reported if the affected address
      becomes PF". Also remove transport cwnd's update when moving
      from PF back to ACTIVE , which is no longer in rfc7829 either.
      
      Note that ulp_notify will be set to false if asoc->expose is
      not 'enabled', according to last patch.
      
      v2->v3:
        - define SCTP_ADDR_PF SCTP_ADDR_POTENTIALLY_FAILED.
      v3->v4:
        - initialize spc_state with SCTP_ADDR_AVAILABLE, as Marcelo suggested.
        - check asoc->pf_expose in sctp_assoc_control_transport(), as Marcelo
          suggested.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      768e1518
    • X
      sctp: add pf_expose per netns and sock and asoc · aef587be
      Xin Long 提交于
      As said in rfc7829, section 3, point 12:
      
        The SCTP stack SHOULD expose the PF state of its destination
        addresses to the ULP as well as provide the means to notify the
        ULP of state transitions of its destination addresses from
        active to PF, and vice versa.  However, it is recommended that
        an SCTP stack implementing SCTP-PF also allows for the ULP to be
        kept ignorant of the PF state of its destinations and the
        associated state transitions, thus allowing for retention of the
        simpler state transition model of [RFC4960] in the ULP.
      
      Not only does it allow to expose the PF state to ULP, but also
      allow to ignore sctp-pf to ULP.
      
      So this patch is to add pf_expose per netns, sock and asoc. And in
      sctp_assoc_control_transport(), ulp_notify will be set to false if
      asoc->expose is not 'enabled' in next patch.
      
      It also allows a user to change pf_expose per netns by sysctl, and
      pf_expose per sock and asoc will be initialized with it.
      
      Note that pf_expose also works for SCTP_GET_PEER_ADDR_INFO sockopt,
      to not allow a user to query the state of a sctp-pf peer address
      when pf_expose is 'disabled', as said in section 7.3.
      
      v1->v2:
        - Fix a build warning noticed by Nathan Chancellor.
      v2->v3:
        - set pf_expose to UNUSED by default to keep compatible with old
          applications.
      v3->v4:
        - add a new entry for pf_expose on ip-sysctl.txt, as Marcelo suggested.
        - change this patch to 1/5, and move sctp_assoc_control_transport
          change into 2/5, as Marcelo suggested.
        - use SCTP_PF_EXPOSE_UNSET instead of SCTP_PF_EXPOSE_UNUSED, and
          set SCTP_PF_EXPOSE_UNSET to 0 in enum, as Marcelo suggested.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aef587be
    • J
      devlink: disallow reload operation during device cleanup · a0c76345
      Jiri Pirko 提交于
      There is a race between driver code that does setup/cleanup of device
      and devlink reload operation that in some drivers works with the same
      code. Use after free could we easily obtained by running:
      
      while true; do
              echo 10 > /sys/bus/netdevsim/new_device
              devlink dev reload netdevsim/netdevsim10 &
              echo 10 > /sys/bus/netdevsim/del_device
      done
      
      Fix this by enabling reload only after setup of device is complete and
      disabling it at the beginning of the cleanup process.
      Reported-by: NIdo Schimmel <idosch@mellanox.com>
      Fixes: 2d8dc5bb ("devlink: Add support for reload")
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a0c76345
    • E
      packet: fix data-race in fanout_flow_is_huge() · b756ad92
      Eric Dumazet 提交于
      KCSAN reported the following data-race [1]
      
      Adding a couple of READ_ONCE()/WRITE_ONCE() should silence it.
      
      Since the report hinted about multiple cpus using the history
      concurrently, I added a test avoiding writing on it if the
      victim slot already contains the desired value.
      
      [1]
      
      BUG: KCSAN: data-race in fanout_demux_rollover / fanout_demux_rollover
      
      read to 0xffff8880b01786cc of 4 bytes by task 18921 on cpu 1:
       fanout_flow_is_huge net/packet/af_packet.c:1303 [inline]
       fanout_demux_rollover+0x33e/0x3f0 net/packet/af_packet.c:1353
       packet_rcv_fanout+0x34e/0x490 net/packet/af_packet.c:1453
       deliver_skb net/core/dev.c:1888 [inline]
       dev_queue_xmit_nit+0x15b/0x540 net/core/dev.c:1958
       xmit_one net/core/dev.c:3195 [inline]
       dev_hard_start_xmit+0x3f5/0x430 net/core/dev.c:3215
       __dev_queue_xmit+0x14ab/0x1b40 net/core/dev.c:3792
       dev_queue_xmit+0x21/0x30 net/core/dev.c:3825
       neigh_direct_output+0x1f/0x30 net/core/neighbour.c:1530
       neigh_output include/net/neighbour.h:511 [inline]
       ip6_finish_output2+0x7a2/0xec0 net/ipv6/ip6_output.c:116
       __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
       __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
       ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
       dst_output include/net/dst.h:436 [inline]
       ip6_local_out+0x74/0x90 net/ipv6/output_core.c:179
       ip6_send_skb+0x53/0x110 net/ipv6/ip6_output.c:1795
       udp_v6_send_skb.isra.0+0x3ec/0xa70 net/ipv6/udp.c:1173
       udpv6_sendmsg+0x1906/0x1c20 net/ipv6/udp.c:1471
       inet6_sendmsg+0x6d/0x90 net/ipv6/af_inet6.c:576
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0x9f/0xc0 net/socket.c:657
       ___sys_sendmsg+0x2b7/0x5d0 net/socket.c:2311
       __sys_sendmmsg+0x123/0x350 net/socket.c:2413
       __do_sys_sendmmsg net/socket.c:2442 [inline]
       __se_sys_sendmmsg net/socket.c:2439 [inline]
       __x64_sys_sendmmsg+0x64/0x80 net/socket.c:2439
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      write to 0xffff8880b01786cc of 4 bytes by task 18922 on cpu 0:
       fanout_flow_is_huge net/packet/af_packet.c:1306 [inline]
       fanout_demux_rollover+0x3a4/0x3f0 net/packet/af_packet.c:1353
       packet_rcv_fanout+0x34e/0x490 net/packet/af_packet.c:1453
       deliver_skb net/core/dev.c:1888 [inline]
       dev_queue_xmit_nit+0x15b/0x540 net/core/dev.c:1958
       xmit_one net/core/dev.c:3195 [inline]
       dev_hard_start_xmit+0x3f5/0x430 net/core/dev.c:3215
       __dev_queue_xmit+0x14ab/0x1b40 net/core/dev.c:3792
       dev_queue_xmit+0x21/0x30 net/core/dev.c:3825
       neigh_direct_output+0x1f/0x30 net/core/neighbour.c:1530
       neigh_output include/net/neighbour.h:511 [inline]
       ip6_finish_output2+0x7a2/0xec0 net/ipv6/ip6_output.c:116
       __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
       __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
       ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
       dst_output include/net/dst.h:436 [inline]
       ip6_local_out+0x74/0x90 net/ipv6/output_core.c:179
       ip6_send_skb+0x53/0x110 net/ipv6/ip6_output.c:1795
       udp_v6_send_skb.isra.0+0x3ec/0xa70 net/ipv6/udp.c:1173
       udpv6_sendmsg+0x1906/0x1c20 net/ipv6/udp.c:1471
       inet6_sendmsg+0x6d/0x90 net/ipv6/af_inet6.c:576
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0x9f/0xc0 net/socket.c:657
       ___sys_sendmsg+0x2b7/0x5d0 net/socket.c:2311
       __sys_sendmmsg+0x123/0x350 net/socket.c:2413
       __do_sys_sendmmsg net/socket.c:2442 [inline]
       __se_sys_sendmmsg net/socket.c:2439 [inline]
       __x64_sys_sendmmsg+0x64/0x80 net/socket.c:2439
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 18922 Comm: syz-executor.3 Not tainted 5.4.0-rc6+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 3b3a5b0a ("packet: rollover huge flows before small flows")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b756ad92