1. 05 10月, 2017 1 次提交
    • K
      timer: Remove expires and data arguments from DEFINE_TIMER · 1d27e3e2
      Kees Cook 提交于
      Drop the arguments from the macro and adjust all callers with the
      following script:
      
        perl -pi -e 's/DEFINE_TIMER\((.*), 0, 0\);/DEFINE_TIMER($1);/g;' \
          $(git grep DEFINE_TIMER | cut -d: -f1 | sort -u | grep -v timer.h)
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # for m68k parts
      Acked-by: Guenter Roeck <linux@roeck-us.net> # for watchdog parts
      Acked-by: David S. Miller <davem@davemloft.net> # for networking parts
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: Kalle Valo <kvalo@codeaurora.org> # for wireless parts
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Cc: linux-mips@linux-mips.org
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Sebastian Reichel <sre@kernel.org>
      Cc: Kalle Valo <kvalo@qca.qualcomm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux1394-devel@lists.sourceforge.net
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-wireless@vger.kernel.org
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Harish Patil <harish.patil@cavium.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Michael Reed <mdr@sgi.com>
      Cc: Manish Chopra <manish.chopra@cavium.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-pm@vger.kernel.org
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Mark Gross <mark.gross@intel.com>
      Cc: linux-watchdog@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: netdev@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Link: https://lkml.kernel.org/r/1507159627-127660-11-git-send-email-keescook@chromium.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      1d27e3e2
  2. 29 9月, 2017 1 次提交
    • L
      Revert "Bluetooth: Add option for disabling legacy ioctl interfaces" · e49aa15e
      Linus Torvalds 提交于
      This reverts commit dbbccdc4.
      
      It turns out that the "legacy" users aren't so legacy at all, and that
      turning off the legacy ioctl will break the current Qt bluetooth stack
      for bluetooth LE devices that were released just a couple of months ago.
      
      So it's simply not true that this was a legacy interface that hasn't
      been needed and is only limited to old legacy BT devices.  Because I
      actually read Kconfig help messages, and actively try to turn off
      features that I don't need, I turned the option off.
      
      Then I spent _way_ too much time debugging BLE issues until I realized
      that it wasn't the Qt and subsurface development that had broken one of
      my dive computer BLE downloads, but simply my broken kernel config.
      
      Maybe in a decade it will be true that this is a legacy interface.  And
      maybe with a better help-text and correct dependencies, this kind of
      legacy removal might be acceptable.  But as things are right now both
      the commit message and the Kconfig help text were misleading, and the
      Kconfig option had the wrong dependenencies.
      
      There's no reason to keep that broken Kconfig option in the tree.
      
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Cc: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e49aa15e
  3. 25 9月, 2017 1 次提交
  4. 23 9月, 2017 4 次提交
  5. 22 9月, 2017 13 次提交
  6. 21 9月, 2017 3 次提交
    • E
      net: change skb->mac_header when Generic XDP calls adjust_head · 92dd5452
      Edward Cree 提交于
      Since XDP's view of the packet includes the MAC header, moving the start-
       of-packet with bpf_xdp_adjust_head needs to also update the offset of the
       MAC header (which is relative to skb->head, not to the skb->data that was
       changed).
      Without this, tcpdump sees packets starting from the old MAC header rather
       than the new one, at least in my tests on the loopback device.
      
      Fixes: b5cdae32 ("net: Generic XDP")
      Signed-off-by: NEdward Cree <ecree@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      92dd5452
    • M
      net: compat: assert the size of cmsg copied in is as expected · c2a64bb9
      Meng Xu 提交于
      The actual length of cmsg fetched in during the second loop
      (i.e., kcmsg - kcmsg_base) could be different from what we
      get from the first loop (i.e., kcmlen).
      
      The main reason is that the two get_user() calls in the two
      loops (i.e., get_user(ucmlen, &ucmsg->cmsg_len) and
      __get_user(ucmlen, &ucmsg->cmsg_len)) could cause ucmlen
      to have different values even they fetch from the same userspace
      address, as user can race to change the memory content in
      &ucmsg->cmsg_len across fetches.
      
      Although in the second loop, the sanity check
      if ((char *)kcmsg_base + kcmlen - (char *)kcmsg < CMSG_ALIGN(tmp))
      is inplace, it only ensures that the cmsg fetched in during the
      second loop does not exceed the length of kcmlen, but not
      necessarily equal to kcmlen. But indicated by the assignment
      kmsg->msg_controllen = kcmlen, we should enforce that.
      
      This patch adds this additional sanity check and ensures that
      what is recorded in kmsg->msg_controllen is the actual cmsg length.
      Signed-off-by: NMeng Xu <mengxu.gatech@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c2a64bb9
    • W
      packet: hold bind lock when rebinding to fanout hook · 008ba2a1
      Willem de Bruijn 提交于
      Packet socket bind operations must hold the po->bind_lock. This keeps
      po->running consistent with whether the socket is actually on a ptype
      list to receive packets.
      
      fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
      binds the fanout object to receive through packet_rcv_fanout.
      
      Make it hold the po->bind_lock when testing po->running and rebinding.
      Else, it can race with other rebind operations, such as that in
      packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
      can result in a socket being added to a fanout group twice, causing
      use-after-free KASAN bug reports, among others.
      
      Reported independently by both trinity and syzkaller.
      Verified that the syzkaller reproducer passes after this patch.
      
      Fixes: dc99f600 ("packet: Add fanout support.")
      Reported-by: Nnixioaming <nixiaoming@huawei.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      008ba2a1
  7. 20 9月, 2017 5 次提交
    • M
      ipv6: fix net.ipv6.conf.all interface DAD handlers · 35e015e1
      Matteo Croce 提交于
      Currently, writing into
      net.ipv6.conf.all.{accept_dad,use_optimistic,optimistic_dad} has no effect.
      Fix handling of these flags by:
      
      - using the maximum of global and per-interface values for the
        accept_dad flag. That is, if at least one of the two values is
        non-zero, enable DAD on the interface. If at least one value is
        set to 2, enable DAD and disable IPv6 operation on the interface if
        MAC-based link-local address was found
      
      - using the logical OR of global and per-interface values for the
        optimistic_dad flag. If at least one of them is set to one, optimistic
        duplicate address detection (RFC 4429) is enabled on the interface
      
      - using the logical OR of global and per-interface values for the
        use_optimistic flag. If at least one of them is set to one,
        optimistic addresses won't be marked as deprecated during source address
        selection on the interface.
      
      While at it, as we're modifying the prototype for ipv6_use_optimistic_addr(),
      drop inline, and let the compiler decide.
      
      Fixes: 7fd2561e ("net: ipv6: Add a sysctl to make optimistic addresses useful candidates")
      Signed-off-by: NMatteo Croce <mcroce@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      35e015e1
    • M
      net: ipv6: fix regression of no RTM_DELADDR sent after DAD failure · 6819a14e
      Mike Manning 提交于
      Commit f784ad3d ("ipv6: do not send RTM_DELADDR for tentative
      addresses") incorrectly assumes that no RTM_NEWADDR are sent for
      addresses in tentative state, as this does happen for the standard
      IPv6 use-case of DAD failure, see the call to ipv6_ifa_notify() in
      addconf_dad_stop(). So as a result of this change, no RTM_DELADDR is
      sent after DAD failure for a link-local when strict DAD (accept_dad=2)
      is configured, or on the next admin down in other cases. The absence
      of this notification breaks backwards compatibility and causes problems
      after DAD failure if this notification was being relied on. The
      solution is to allow RTM_DELADDR to still be sent after DAD failure.
      
      Fixes: f784ad3d ("ipv6: do not send RTM_DELADDR for tentative addresses")
      Signed-off-by: NMike Manning <mmanning@brocade.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6819a14e
    • D
      bpf: fix ri->map_owner pointer on bpf_prog_realloc · 7c300131
      Daniel Borkmann 提交于
      Commit 109980b8 ("bpf: don't select potentially stale
      ri->map from buggy xdp progs") passed the pointer to the prog
      itself to be loaded into r4 prior on bpf_redirect_map() helper
      call, so that we can store the owner into ri->map_owner out of
      the helper.
      
      Issue with that is that the actual address of the prog is still
      subject to change when subsequent rewrites occur that require
      slow path in bpf_prog_realloc() to alloc more memory, e.g. from
      patching inlining helper functions or constant blinding. Thus,
      we really need to take prog->aux as the address we're holding,
      which also works with prog clones as they share the same aux
      object.
      
      Instead of then fetching aux->prog during runtime, which could
      potentially incur cache misses due to false sharing, we are
      going to just use aux for comparison on the map owner. This
      will also keep the patchlet of the same size, and later check
      in xdp_map_invalid() only accesses read-only aux pointer from
      the prog, it's also in the same cacheline already from prior
      access when calling bpf_func.
      
      Fixes: 109980b8 ("bpf: don't select potentially stale ri->map from buggy xdp progs")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c300131
    • E
      tcp: fastopen: fix on syn-data transmit failure · b5b7db8d
      Eric Dumazet 提交于
      Our recent change exposed a bug in TCP Fastopen Client that syzkaller
      found right away [1]
      
      When we prepare skb with SYN+DATA, we attempt to transmit it,
      and we update socket state as if the transmit was a success.
      
      In socket RTX queue we have two skbs, one with the SYN alone,
      and a second one containing the DATA.
      
      When (malicious) ACK comes in, we now complain that second one had no
      skb_mstamp.
      
      The proper fix is to make sure that if the transmit failed, we do not
      pretend we sent the DATA skb, and make it our send_head.
      
      When 3WHS completes, we can now send the DATA right away, without having
      to wait for a timeout.
      
      [1]
      WARNING: CPU: 0 PID: 100189 at net/ipv4/tcp_input.c:3117 tcp_clean_rtx_queue+0x2057/0x2ab0 net/ipv4/tcp_input.c:3117()
      
       WARN_ON_ONCE(last_ackt == 0);
      
      Modules linked in:
      CPU: 0 PID: 100189 Comm: syz-executor1 Not tainted
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       0000000000000000 ffff8800b35cb1d8 ffffffff81cad00d 0000000000000000
       ffffffff828a4347 ffff88009f86c080 ffffffff8316eb20 0000000000000d7f
       ffff8800b35cb220 ffffffff812c33c2 ffff8800baad2440 00000009d46575c0
      Call Trace:
       [<ffffffff81cad00d>] __dump_stack
       [<ffffffff81cad00d>] dump_stack+0xc1/0x124
       [<ffffffff812c33c2>] warn_slowpath_common+0xe2/0x150
       [<ffffffff812c361e>] warn_slowpath_null+0x2e/0x40
       [<ffffffff828a4347>] tcp_clean_rtx_queue+0x2057/0x2ab0 n
       [<ffffffff828ae6fd>] tcp_ack+0x151d/0x3930
       [<ffffffff828baa09>] tcp_rcv_state_process+0x1c69/0x4fd0
       [<ffffffff828efb7f>] tcp_v4_do_rcv+0x54f/0x7c0
       [<ffffffff8258aacb>] sk_backlog_rcv
       [<ffffffff8258aacb>] __release_sock+0x12b/0x3a0
       [<ffffffff8258ad9e>] release_sock+0x5e/0x1c0
       [<ffffffff8294a785>] inet_wait_for_connect
       [<ffffffff8294a785>] __inet_stream_connect+0x545/0xc50
       [<ffffffff82886f08>] tcp_sendmsg_fastopen
       [<ffffffff82886f08>] tcp_sendmsg+0x2298/0x35a0
       [<ffffffff82952515>] inet_sendmsg+0xe5/0x520
       [<ffffffff8257152f>] sock_sendmsg_nosec
       [<ffffffff8257152f>] sock_sendmsg+0xcf/0x110
      
      Fixes: 8c72c65b ("tcp: update skb->skb_mstamp more carefully")
      Fixes: 783237e8 ("net-tcp: Fast Open client - sending SYN-data")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b5b7db8d
    • I
      libceph: don't allow bidirectional swap of pg-upmap-items · 29a0cfbf
      Ilya Dryomov 提交于
      This reverts most of commit f53b7665 ("libceph: upmap semantic
      changes").
      
      We need to prevent duplicates in the final result.  For example, we
      can currently take
      
        [1,2,3] and apply [(1,2)] and get [2,2,3]
      
      or
      
        [1,2,3] and apply [(3,2)] and get [1,2,2]
      
      The rest of the system is not prepared to handle duplicates in the
      result set like this.
      
      The reverted piece was intended to allow
      
        [1,2,3] and [(1,2),(2,1)] to get [2,1,3]
      
      to reorder primaries.  First, this bidirectional swap is hard to
      implement in a way that also prevents dups.  For example, [1,2,3] and
      [(1,4),(2,3),(3,4)] would give [4,3,4] but would we just drop the last
      step we'd have [4,3,3] which is also invalid, etc.  Simpler to just not
      handle bidirectional swaps.  In practice, they are not needed: if you
      just want to choose a different primary then use primary_affinity, or
      pg_upmap (not pg_upmap_items).
      
      Cc: stable@vger.kernel.org # 4.13
      Link: http://tracker.ceph.com/issues/21410Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      29a0cfbf
  8. 19 9月, 2017 6 次提交
    • Y
      tcp: remove two unused functions · 4c712441
      Yuchung Cheng 提交于
      remove tcp_may_send_now and tcp_snd_test that are no longer used
      
      Fixes: 840a3cbe ("tcp: remove forward retransmit feature")
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c712441
    • D
      net/sched: cls_matchall: fix crash when used with classful qdisc · 3ff4cbec
      Davide Caratti 提交于
      this script, edited from Linux Advanced Routing and Traffic Control guide
      
      tc q a dev en0 root handle 1: htb default a
      tc c a dev en0 parent 1:  classid 1:1 htb rate 6mbit burst 15k
      tc c a dev en0 parent 1:1 classid 1:a htb rate 5mbit ceil 6mbit burst 15k
      tc c a dev en0 parent 1:1 classid 1:b htb rate 1mbit ceil 6mbit burst 15k
      tc f a dev en0 parent 1:0 prio 1 $clsname $clsargs classid 1:b
      ping $address -c1
      tc -s c s dev en0
      
      classifies traffic to 1:b or 1:a, depending on whether the packet matches
      or not the pattern $clsargs of filter $clsname. However, when $clsname is
      'matchall', a systematic crash can be observed in htb_classify(). HTB and
      classful qdiscs don't assign initial value to struct tcf_result, but then
      they expect it to contain valid values after filters have been run. Thus,
      current 'matchall' ignores the TCA_MATCHALL_CLASSID attribute, configured
      by user, and makes HTB (and classful qdiscs) dereference random pointers.
      
      By assigning head->res to *res in mall_classify(), before the actions are
      invoked, we fix this crash and enable TCA_MATCHALL_CLASSID functionality,
      that had no effect on 'matchall' classifier since its first introduction.
      
      BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1460213Reported-by: NJiri Benc <jbenc@redhat.com>
      Fixes: b87f7936 ("net/sched: introduce Match-all classifier")
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Acked-by: NYotam Gigi <yotamg@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3ff4cbec
    • X
      ip6_tunnel: do not allow loading ip6_tunnel if ipv6 is disabled in cmdline · 8c22dab0
      Xin Long 提交于
      If ipv6 has been disabled from cmdline since kernel started, it makes
      no sense to allow users to create any ip6 tunnel. Otherwise, it could
      some potential problem.
      
      Jianlin found a kernel crash caused by this in ip6_gre when he set
      ipv6.disable=1 in grub:
      
      [  209.588865] Unable to handle kernel paging request for data at address 0x00000080
      [  209.588872] Faulting instruction address: 0xc000000000a3aa6c
      [  209.588879] Oops: Kernel access of bad area, sig: 11 [#1]
      [  209.589062] NIP [c000000000a3aa6c] fib_rules_lookup+0x4c/0x260
      [  209.589071] LR [c000000000b9ad90] fib6_rule_lookup+0x50/0xb0
      [  209.589076] Call Trace:
      [  209.589097] fib6_rule_lookup+0x50/0xb0
      [  209.589106] rt6_lookup+0xc4/0x110
      [  209.589116] ip6gre_tnl_link_config+0x214/0x2f0 [ip6_gre]
      [  209.589125] ip6gre_newlink+0x138/0x3a0 [ip6_gre]
      [  209.589134] rtnl_newlink+0x798/0xb80
      [  209.589142] rtnetlink_rcv_msg+0xec/0x390
      [  209.589151] netlink_rcv_skb+0x138/0x150
      [  209.589159] rtnetlink_rcv+0x48/0x70
      [  209.589169] netlink_unicast+0x538/0x640
      [  209.589175] netlink_sendmsg+0x40c/0x480
      [  209.589184] ___sys_sendmsg+0x384/0x4e0
      [  209.589194] SyS_sendmsg+0xd4/0x140
      [  209.589201] SyS_socketcall+0x3e0/0x4f0
      [  209.589209] system_call+0x38/0xe0
      
      This patch is to return -EOPNOTSUPP in ip6_tunnel_init if ipv6 has been
      disabled from cmdline.
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8c22dab0
    • X
      ip6_gre: skb_push ipv6hdr before packing the header in ip6gre_header · 76cc0d32
      Xin Long 提交于
      Now in ip6gre_header before packing the ipv6 header, it skb_push t->hlen
      which only includes encap_hlen + tun_hlen. It means greh and inner header
      would be over written by ipv6 stuff and ipv6h might have no chance to set
      up.
      
      Jianlin found this issue when using remote any on ip6_gre, the packets he
      captured on gre dev are truncated:
      
      22:50:26.210866 Out ethertype IPv6 (0x86dd), length 120: truncated-ip6 -\
      8128 bytes missing!(flowlabel 0x92f40, hlim 0, next-header Options (0)  \
      payload length: 8192) ::1:2000:0 > ::1:0:86dd: HBH [trunc] ip-proto-128 \
      8184
      
      It should also skb_push ipv6hdr so that ipv6h points to the right position
      to set ipv6 stuff up.
      
      This patch is to skb_push hlen + sizeof(*ipv6h) and also fix some indents
      in ip6gre_header.
      
      Fixes: c12b395a ("gre: Support GRE over IPv6")
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      76cc0d32
    • J
      nl80211: fix null-ptr dereference on invalid mesh configuration · 265698d7
      Johannes Berg 提交于
      If TX rates are specified during mesh join, the channel must
      also be specified. Check the channel pointer to avoid a null
      pointer dereference if it isn't.
      Reported-by: NJouni Malinen <j@w1.fi>
      Fixes: 8564e382 ("cfg80211: add checks for beacon rate, extend to mesh")
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      265698d7
    • S
      udpv6: Fix the checksum computation when HW checksum does not apply · 63ecc3d9
      Subash Abhinov Kasiviswanathan 提交于
      While trying an ESP transport mode encryption for UDPv6 packets of
      datagram size 1436 with MTU 1500, checksum error was observed in
      the secondary fragment.
      
      This error occurs due to the UDP payload checksum being missed out
      when computing the full checksum for these packets in
      udp6_hwcsum_outgoing().
      
      Fixes: d39d938c ("ipv6: Introduce udpv6_send_skb()")
      Signed-off-by: NSubash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      63ecc3d9
  9. 18 9月, 2017 2 次提交
  10. 17 9月, 2017 1 次提交
  11. 16 9月, 2017 3 次提交
    • X
      sctp: do not mark sk dumped when inet_sctp_diag_fill returns err · 8c7c19a5
      Xin Long 提交于
      sctp_diag would not actually dump out sk/asoc if inet_sctp_diag_fill
      returns err, in which case it shouldn't mark sk dumped by setting
      cb->args[3] as 1 in sctp_sock_dump().
      
      Otherwise, it could cause some asocs to have no parent's sk dumped
      in 'ss --sctp'.
      
      So this patch is to not set cb->args[3] when inet_sctp_diag_fill()
      returns err in sctp_sock_dump().
      
      Fixes: 8f840e47 ("sctp: add the sctp_diag.c file")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8c7c19a5
    • X
      sctp: fix an use-after-free issue in sctp_sock_dump · d25adbeb
      Xin Long 提交于
      Commit 86fdb344 ("sctp: ensure ep is not destroyed before doing the
      dump") tried to fix an use-after-free issue by checking !sctp_sk(sk)->ep
      with holding sock and sock lock.
      
      But Paolo noticed that endpoint could be destroyed in sctp_rcv without
      sock lock protection. It means the use-after-free issue still could be
      triggered when sctp_rcv put and destroy ep after sctp_sock_dump checks
      !ep, although it's pretty hard to reproduce.
      
      I could reproduce it by mdelay in sctp_rcv while msleep in sctp_close
      and sctp_sock_dump long time.
      
      This patch is to add another param cb_done to sctp_for_each_transport
      and dump ep->assocs with holding tsp after jumping out of transport's
      traversal in it to avoid this issue.
      
      It can also improve sctp diag dump to make it run faster, as no need
      to save sk into cb->args[5] and keep calling sctp_for_each_transport
      any more.
      
      This patch is also to use int * instead of int for the pos argument
      in sctp_for_each_transport, which could make postion increment only
      in sctp_for_each_transport and no need to keep changing cb->args[2]
      in sctp_sock_filter and sctp_sock_dump any more.
      
      Fixes: 86fdb344 ("sctp: ensure ep is not destroyed before doing the dump")
      Reported-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d25adbeb
    • E
      tcp: update skb->skb_mstamp more carefully · 8c72c65b
      Eric Dumazet 提交于
      liujian reported a problem in TCP_USER_TIMEOUT processing with a patch
      in tcp_probe_timer() :
            https://www.spinics.net/lists/netdev/msg454496.html
      
      After investigations, the root cause of the problem is that we update
      skb->skb_mstamp of skbs in write queue, even if the attempt to send a
      clone or copy of it failed. One reason being a routing problem.
      
      This patch prevents this, solving liujian issue.
      
      It also removes a potential RTT miscalculation, since
      __tcp_retransmit_skb() is not OR-ing TCP_SKB_CB(skb)->sacked with
      TCPCB_EVER_RETRANS if a failure happens, but skb->skb_mstamp has
      been changed.
      
      A future ACK would then lead to a very small RTT sample and min_rtt
      would then be lowered to this too small value.
      
      Tested:
      
      # cat user_timeout.pkt
      --local_ip=192.168.102.64
      
          0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
         +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
         +0 bind(3, ..., ...) = 0
         +0 listen(3, 1) = 0
      
         +0 `ifconfig tun0 192.168.102.64/16; ip ro add 192.0.2.1 dev tun0`
      
         +0 < S 0:0(0) win 0 <mss 1460>
         +0 > S. 0:0(0) ack 1 <mss 1460>
      
        +.1 < . 1:1(0) ack 1 win 65530
         +0 accept(3, ..., ...) = 4
      
         +0 setsockopt(4, SOL_TCP, TCP_USER_TIMEOUT, [3000], 4) = 0
         +0 write(4, ..., 24) = 24
         +0 > P. 1:25(24) ack 1 win 29200
         +.1 < . 1:1(0) ack 25 win 65530
      
      //change the ipaddress
         +1 `ifconfig tun0 192.168.0.10/16`
      
         +1 write(4, ..., 24) = 24
         +1 write(4, ..., 24) = 24
         +1 write(4, ..., 24) = 24
         +1 write(4, ..., 24) = 24
      
         +0 `ifconfig tun0 192.168.102.64/16`
         +0 < . 1:2(1) ack 25 win 65530
         +0 `ifconfig tun0 192.168.0.10/16`
      
         +3 write(4, ..., 24) = -1
      
      # ./packetdrill user_timeout.pkt
      Signed-off-by: NEric Dumazet <edumazet@googl.com>
      Reported-by: Nliujian <liujian56@huawei.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8c72c65b