1. 21 12月, 2019 7 次提交
    • E
      tcp: md5: fix potential overestimation of TCP option space · a148815a
      Eric Dumazet 提交于
      [ Upstream commit 9424e2e7ad93ffffa88f882c9bc5023570904b55 ]
      
      Back in 2008, Adam Langley fixed the corner case of packets for flows
      having all of the following options : MD5 TS SACK
      
      Since MD5 needs 20 bytes, and TS needs 12 bytes, no sack block
      can be cooked from the remaining 8 bytes.
      
      tcp_established_options() correctly sets opts->num_sack_blocks
      to zero, but returns 36 instead of 32.
      
      This means TCP cooks packets with 4 extra bytes at the end
      of options, containing unitialized bytes.
      
      Fixes: 33ad798c ("tcp: options clean up")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a148815a
    • A
      openvswitch: support asymmetric conntrack · 6f99afcc
      Aaron Conole 提交于
      [ Upstream commit 5d50aa83e2c8e91ced2cca77c198b468ca9210f4 ]
      
      The openvswitch module shares a common conntrack and NAT infrastructure
      exposed via netfilter.  It's possible that a packet needs both SNAT and
      DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
      this because it runs through the NAT table twice - once on ingress and
      again after egress.  The openvswitch module doesn't have such capability.
      
      Like netfilter hook infrastructure, we should run through NAT twice to
      keep the symmetry.
      
      Fixes: 05752523 ("openvswitch: Interface with NAT.")
      Signed-off-by: NAaron Conole <aconole@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f99afcc
    • D
      net: sched: fix dump qlen for sch_mq/sch_mqprio with NOLOCK subqueues · 0c5a4dd6
      Dust Li 提交于
      [ Upstream commit 2f23cd42e19c22c24ff0e221089b7b6123b117c5 ]
      
      sch->q.len hasn't been set if the subqueue is a NOLOCK qdisc
       in mq_dump() and mqprio_dump().
      
      Fixes: ce679e8d ("net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mqprio")
      Signed-off-by: NDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: NTony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c5a4dd6
    • A
      net: dsa: fix flow dissection on Tx path · a7d80e75
      Alexander Lobakin 提交于
      [ Upstream commit 8bef0af09a5415df761b04fa487a6c34acae74bc ]
      
      Commit 43e66528 ("net-next: dsa: fix flow dissection") added an
      ability to override protocol and network offset during flow dissection
      for DSA-enabled devices (i.e. controllers shipped as switch CPU ports)
      in order to fix skb hashing for RPS on Rx path.
      
      However, skb_hash() and added part of code can be invoked not only on
      Rx, but also on Tx path if we have a multi-queued device and:
       - kernel is running on UP system or
       - XPS is not configured.
      
      The call stack in this two cases will be like: dev_queue_xmit() ->
      __dev_queue_xmit() -> netdev_core_pick_tx() -> netdev_pick_tx() ->
      skb_tx_hash() -> skb_get_hash().
      
      The problem is that skbs queued for Tx have both network offset and
      correct protocol already set up even after inserting a CPU tag by DSA
      tagger, so calling tag_ops->flow_dissect() on this path actually only
      breaks flow dissection and hashing.
      
      This can be observed by adding debug prints just before and right after
      tag_ops->flow_dissect() call to the related block of code:
      
      Before the patch:
      
      Rx path (RPS):
      
      [   19.240001] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   19.244271] tag_ops->flow_dissect()
      [   19.247811] Rx: proto: 0x0800, nhoff: 8	/* ETH_P_IP */
      
      [   19.215435] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   19.219746] tag_ops->flow_dissect()
      [   19.223241] Rx: proto: 0x0806, nhoff: 8	/* ETH_P_ARP */
      
      [   18.654057] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   18.658332] tag_ops->flow_dissect()
      [   18.661826] Rx: proto: 0x8100, nhoff: 8	/* ETH_P_8021Q */
      
      Tx path (UP system):
      
      [   18.759560] Tx: proto: 0x0800, nhoff: 26	/* ETH_P_IP */
      [   18.763933] tag_ops->flow_dissect()
      [   18.767485] Tx: proto: 0x920b, nhoff: 34	/* junk */
      
      [   22.800020] Tx: proto: 0x0806, nhoff: 26	/* ETH_P_ARP */
      [   22.804392] tag_ops->flow_dissect()
      [   22.807921] Tx: proto: 0x920b, nhoff: 34	/* junk */
      
      [   16.898342] Tx: proto: 0x86dd, nhoff: 26	/* ETH_P_IPV6 */
      [   16.902705] tag_ops->flow_dissect()
      [   16.906227] Tx: proto: 0x920b, nhoff: 34	/* junk */
      
      After:
      
      Rx path (RPS):
      
      [   16.520993] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   16.525260] tag_ops->flow_dissect()
      [   16.528808] Rx: proto: 0x0800, nhoff: 8	/* ETH_P_IP */
      
      [   15.484807] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   15.490417] tag_ops->flow_dissect()
      [   15.495223] Rx: proto: 0x0806, nhoff: 8	/* ETH_P_ARP */
      
      [   17.134621] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   17.138895] tag_ops->flow_dissect()
      [   17.142388] Rx: proto: 0x8100, nhoff: 8	/* ETH_P_8021Q */
      
      Tx path (UP system):
      
      [   15.499558] Tx: proto: 0x0800, nhoff: 26	/* ETH_P_IP */
      
      [   20.664689] Tx: proto: 0x0806, nhoff: 26	/* ETH_P_ARP */
      
      [   18.565782] Tx: proto: 0x86dd, nhoff: 26	/* ETH_P_IPV6 */
      
      In order to fix that we can add the check 'proto == htons(ETH_P_XDSA)'
      to prevent code from calling tag_ops->flow_dissect() on Tx.
      I also decided to initialize 'offset' variable so tagger callbacks can
      now safely leave it untouched without provoking a chaos.
      
      Fixes: 43e66528 ("net-next: dsa: fix flow dissection")
      Signed-off-by: NAlexander Lobakin <alobakin@dlink.ru>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7d80e75
    • N
      net: bridge: deny dev_set_mac_address() when unregistering · bb168ebe
      Nikolay Aleksandrov 提交于
      [ Upstream commit c4b4c421857dc7b1cf0dccbd738472360ff2cd70 ]
      
      We have an interesting memory leak in the bridge when it is being
      unregistered and is a slave to a master device which would change the
      mac of its slaves on unregister (e.g. bond, team). This is a very
      unusual setup but we do end up leaking 1 fdb entry because
      dev_set_mac_address() would cause the bridge to insert the new mac address
      into its table after all fdbs are flushed, i.e. after dellink() on the
      bridge has finished and we call NETDEV_UNREGISTER the bond/team would
      release it and will call dev_set_mac_address() to restore its original
      address and that in turn will add an fdb in the bridge.
      One fix is to check for the bridge dev's reg_state in its
      ndo_set_mac_address callback and return an error if the bridge is not in
      NETREG_REGISTERED.
      
      Easy steps to reproduce:
       1. add bond in mode != A/B
       2. add any slave to the bond
       3. add bridge dev as a slave to the bond
       4. destroy the bridge device
      
      Trace:
       unreferenced object 0xffff888035c4d080 (size 128):
         comm "ip", pid 4068, jiffies 4296209429 (age 1413.753s)
         hex dump (first 32 bytes):
           41 1d c9 36 80 88 ff ff 00 00 00 00 00 00 00 00  A..6............
           d2 19 c9 5e 3f d7 00 00 00 00 00 00 00 00 00 00  ...^?...........
         backtrace:
           [<00000000ddb525dc>] kmem_cache_alloc+0x155/0x26f
           [<00000000633ff1e0>] fdb_create+0x21/0x486 [bridge]
           [<0000000092b17e9c>] fdb_insert+0x91/0xdc [bridge]
           [<00000000f2a0f0ff>] br_fdb_change_mac_address+0xb3/0x175 [bridge]
           [<000000001de02dbd>] br_stp_change_bridge_id+0xf/0xff [bridge]
           [<00000000ac0e32b1>] br_set_mac_address+0x76/0x99 [bridge]
           [<000000006846a77f>] dev_set_mac_address+0x63/0x9b
           [<00000000d30738fc>] __bond_release_one+0x3f6/0x455 [bonding]
           [<00000000fc7ec01d>] bond_netdev_event+0x2f2/0x400 [bonding]
           [<00000000305d7795>] notifier_call_chain+0x38/0x56
           [<0000000028885d4a>] call_netdevice_notifiers+0x1e/0x23
           [<000000008279477b>] rollback_registered_many+0x353/0x6a4
           [<0000000018ef753a>] unregister_netdevice_many+0x17/0x6f
           [<00000000ba854b7a>] rtnl_delete_link+0x3c/0x43
           [<00000000adf8618d>] rtnl_dellink+0x1dc/0x20a
           [<000000009b6395fd>] rtnetlink_rcv_msg+0x23d/0x268
      
      Fixes: 43598813 ("bridge: add local MAC address to forwarding table (v2)")
      Reported-by: syzbot+2add91c08eb181fea1bf@syzkaller.appspotmail.com
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bb168ebe
    • V
      mqprio: Fix out-of-bounds access in mqprio_dump · 588fac83
      Vladyslav Tarasiuk 提交于
      [ Upstream commit 9f104c7736904ac72385bbb48669e0c923ca879b ]
      
      When user runs a command like
      tc qdisc add dev eth1 root mqprio
      KASAN stack-out-of-bounds warning is emitted.
      Currently, NLA_ALIGN macro used in mqprio_dump provides too large
      buffer size as argument for nla_put and memcpy down the call stack.
      The flow looks like this:
      1. nla_put expects exact object size as an argument;
      2. Later it provides this size to memcpy;
      3. To calculate correct padding for SKB, nla_put applies NLA_ALIGN
         macro itself.
      
      Therefore, NLA_ALIGN should not be applied to the nla_put parameter.
      Otherwise it will lead to out-of-bounds memory access in memcpy.
      
      Fixes: 4e8b86c0 ("mqprio: Introduce new hardware offload mode and shaper in mqprio")
      Signed-off-by: NVladyslav Tarasiuk <vladyslavt@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      588fac83
    • E
      inet: protect against too small mtu values. · d80d67cd
      Eric Dumazet 提交于
      [ Upstream commit 501a90c945103e8627406763dac418f20f3837b2 ]
      
      syzbot was once again able to crash a host by setting a very small mtu
      on loopback device.
      
      Let's make inetdev_valid_mtu() available in include/net/ip.h,
      and use it in ip_setup_cork(), so that we protect both ip_append_page()
      and __ip_append_data()
      
      Also add a READ_ONCE() when the device mtu is read.
      
      Pairs this lockless read with one WRITE_ONCE() in __dev_set_mtu(),
      even if other code paths might write over this field.
      
      Add a big comment in include/linux/netdevice.h about dev->mtu
      needing READ_ONCE()/WRITE_ONCE() annotations.
      
      Hopefully we will add the missing ones in followup patches.
      
      [1]
      
      refcount_t: saturated; leaking memory.
      WARNING: CPU: 0 PID: 9464 at lib/refcount.c:22 refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 9464 Comm: syz-executor850 Not tainted 5.4.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       panic+0x2e3/0x75c kernel/panic.c:221
       __warn.cold+0x2f/0x3e kernel/panic.c:582
       report_bug+0x289/0x300 lib/bug.c:195
       fixup_bug arch/x86/kernel/traps.c:174 [inline]
       fixup_bug arch/x86/kernel/traps.c:169 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
       invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
      RIP: 0010:refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Code: 06 31 ff 89 de e8 c8 f5 e6 fd 84 db 0f 85 6f ff ff ff e8 7b f4 e6 fd 48 c7 c7 e0 71 4f 88 c6 05 56 a6 a4 06 01 e8 c7 a8 b7 fd <0f> 0b e9 50 ff ff ff e8 5c f4 e6 fd 0f b6 1d 3d a6 a4 06 31 ff 89
      RSP: 0018:ffff88809689f550 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff815e4336 RDI: ffffed1012d13e9c
      RBP: ffff88809689f560 R08: ffff88809c50a3c0 R09: fffffbfff15d31b1
      R10: fffffbfff15d31b0 R11: ffffffff8ae98d87 R12: 0000000000000001
      R13: 0000000000040100 R14: ffff888099041104 R15: ffff888218d96e40
       refcount_add include/linux/refcount.h:193 [inline]
       skb_set_owner_w+0x2b6/0x410 net/core/sock.c:1999
       sock_wmalloc+0xf1/0x120 net/core/sock.c:2096
       ip_append_page+0x7ef/0x1190 net/ipv4/ip_output.c:1383
       udp_sendpage+0x1c7/0x480 net/ipv4/udp.c:1276
       inet_sendpage+0xdb/0x150 net/ipv4/af_inet.c:821
       kernel_sendpage+0x92/0xf0 net/socket.c:3794
       sock_sendpage+0x8b/0xc0 net/socket.c:936
       pipe_to_sendpage+0x2da/0x3c0 fs/splice.c:458
       splice_from_pipe_feed fs/splice.c:512 [inline]
       __splice_from_pipe+0x3ee/0x7c0 fs/splice.c:636
       splice_from_pipe+0x108/0x170 fs/splice.c:671
       generic_splice_sendpage+0x3c/0x50 fs/splice.c:842
       do_splice_from fs/splice.c:861 [inline]
       direct_splice_actor+0x123/0x190 fs/splice.c:1035
       splice_direct_to_actor+0x3b4/0xa30 fs/splice.c:990
       do_splice_direct+0x1da/0x2a0 fs/splice.c:1078
       do_sendfile+0x597/0xd00 fs/read_write.c:1464
       __do_sys_sendfile64 fs/read_write.c:1525 [inline]
       __se_sys_sendfile64 fs/read_write.c:1511 [inline]
       __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441409
      Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffb64c4f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441409
      RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000005
      RBP: 0000000000073b8a R08: 0000000000000010 R09: 0000000000000010
      R10: 0000000000010001 R11: 0000000000000246 R12: 0000000000402180
      R13: 0000000000402210 R14: 0000000000000000 R15: 0000000000000000
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 1470ddf7 ("inet: Remove explicit write references to sk/inet in ip_append_data")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d80d67cd
  2. 18 12月, 2019 4 次提交
    • P
      sunrpc: fix crash when cache_head become valid before update · 67256225
      Pavel Tikhomirov 提交于
      [ Upstream commit 5fcaf6982d1167f1cd9b264704f6d1ef4c505d54 ]
      
      I was investigating a crash in our Virtuozzo7 kernel which happened in
      in svcauth_unix_set_client. I found out that we access m_client field
      in ip_map structure, which was received from sunrpc_cache_lookup (we
      have a bit older kernel, now the code is in sunrpc_cache_add_entry), and
      these field looks uninitialized (m_client == 0x74 don't look like a
      pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID.
      
      It looks like the problem appeared from our previous fix to sunrpc (1):
      commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued
      request")
      
      And we've also found a patch already fixing our patch (2):
      commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")
      
      Though the crash is eliminated, I think the core of the problem is not
      completely fixed:
      
      Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before
      cache_fresh_locked which was added in (1) to fix crash. These way
      cache_is_valid won't say the cache is valid anymore and in
      svcauth_unix_set_client the function cache_check will return error
      instead of 0, and we don't count entry as initialized.
      
      But it looks like we need to remove cache_fresh_locked completely in
      sunrpc_cache_lookup:
      
      In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so
      that cache_requests with no readers also release corresponding
      cache_head, to fix their leak.  We with Vasily were not sure if
      cache_fresh_locked and cache_fresh_unlocked should be used in pair or
      not, so we've guessed to use them in pair.
      
      Now we see that we don't want the CACHE_VALID bit set here by
      cache_fresh_locked, as "valid" means "initialized" and there is no
      initialization in sunrpc_cache_add_entry. Both expiry_time and
      last_refresh are not used in cache_fresh_unlocked code-path and also not
      required for the initial fix.
      
      So to conclude cache_fresh_locked was called by mistake, and we can just
      safely remove it instead of crutching it with CACHE_NEGATIVE. It looks
      ideologically better for me. Hope I don't miss something here.
      
      Here is our crash backtrace:
      [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 0000000000000074
      [13108726.326365] IP: [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.326448] PGD 0
      [13108726.326468] Oops: 0002 [#1] SMP
      [13108726.326497] Modules linked in: nbd isofs xfs loop kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag tcp_diag inet_diag netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4
      [13108726.327173]  dns_resolver cls_u32 binfmt_misc arptable_filter arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet ip6_vznetstat ip_vznetstat
      [13108726.327817]  ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath ghash_clmulni_intel uas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd tg3 smartpqi scsi_transport_sas mdio libcrc32c i2c_core usb_storage ptp pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kpatch_cumulative_82_0_r1]
      [13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded Tainted: G        W  O   ------------   3.10.0-862.20.2.vz7.73.29 #1 73.29
      [13108726.328491] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 10/02/2018
      [13108726.328554] task: ffffa0a6a41b1160 ti: ffffa0c2a74bc000 task.ti: ffffa0c2a74bc000
      [13108726.328610] RIP: 0010:[<ffffffffc01f79eb>]  [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.328706] RSP: 0018:ffffa0c2a74bfd80  EFLAGS: 00010246
      [13108726.328750] RAX: 0000000000000001 RBX: ffffa0a6183ae000 RCX: 0000000000000000
      [13108726.328811] RDX: 0000000000000074 RSI: 0000000000000286 RDI: ffffa0c2a74bfcf0
      [13108726.328864] RBP: ffffa0c2a74bfe00 R08: ffffa0bab8c22960 R09: 0000000000000001
      [13108726.328916] R10: 0000000000000001 R11: 0000000000000001 R12: ffffa0a32aa7f000
      [13108726.328969] R13: ffffa0a6183afac0 R14: ffffa0c233d88d00 R15: ffffa0c2a74bfdb4
      [13108726.329022] FS:  0000000000000000(0000) GS:ffffa0e17f9c0000(0000) knlGS:0000000000000000
      [13108726.329081] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [13108726.332311] CR2: 0000000000000074 CR3: 00000026a1b28000 CR4: 00000000007607e0
      [13108726.334606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [13108726.336754] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [13108726.338908] PKRU: 00000000
      [13108726.341047] Call Trace:
      [13108726.343074]  [<ffffffff8a2c78b4>] ? groups_alloc+0x34/0x110
      [13108726.344837]  [<ffffffffc01f5eb4>] svc_set_client+0x24/0x30 [sunrpc]
      [13108726.346631]  [<ffffffffc01f2ac1>] svc_process_common+0x241/0x710 [sunrpc]
      [13108726.348332]  [<ffffffffc01f3093>] svc_process+0x103/0x190 [sunrpc]
      [13108726.350016]  [<ffffffffc07d605f>] nfsd+0xdf/0x150 [nfsd]
      [13108726.351735]  [<ffffffffc07d5f80>] ? nfsd_destroy+0x80/0x80 [nfsd]
      [13108726.353459]  [<ffffffff8a2bf741>] kthread+0xd1/0xe0
      [13108726.355195]  [<ffffffff8a2bf670>] ? create_kthread+0x60/0x60
      [13108726.356896]  [<ffffffff8a9556dd>] ret_from_fork_nospec_begin+0x7/0x21
      [13108726.358577]  [<ffffffff8a2bf670>] ? create_kthread+0x60/0x60
      [13108726.360240] Code: 4c 8b 45 98 0f 8e 2e 01 00 00 83 f8 fe 0f 84 76 fe ff ff 85 c0 0f 85 2b 01 00 00 49 8b 50 40 b8 01 00 00 00 48 89 93 d0 1a 00 00 <f0> 0f c1 02 83 c0 01 83 f8 01 0f 8e 53 02 00 00 49 8b 44 24 38
      [13108726.363769] RIP  [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.365530]  RSP <ffffa0c2a74bfd80>
      [13108726.367179] CR2: 0000000000000074
      
      Fixes: d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")
      Signed-off-by: NPavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Acked-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      67256225
    • C
      gre: refetch erspan header from skb->data after pskb_may_pull() · f7654ebe
      Cong Wang 提交于
      [ Upstream commit 0e4940928c26527ce8f97237fef4c8a91cd34207 ]
      
      After pskb_may_pull() we should always refetch the header
      pointers from the skb->data in case it got reallocated.
      
      In gre_parse_header(), the erspan header is still fetched
      from the 'options' pointer which is fetched before
      pskb_may_pull().
      
      Found this during code review of a KMSAN bug report.
      
      Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup")
      Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Acked-by: NWilliam Tu <u9012063@gmail.com>
      Reviewed-by: NSimon Horman <simon.horman@netronome.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      f7654ebe
    • K
      net/smc: do not wait under send_lock · 82060d93
      Karsten Graul 提交于
      [ Upstream commit 33f3fcc290671590821ff3c0c9396db1ec9b7d4c ]
      
      smc_cdc_get_free_slot() might wait for free transfer buffers when using
      SMC-R. This wait should not be done under the send_lock, which is a
      spin_lock. This fixes a cpu loop in parallel threads waiting for the
      send_lock.
      Signed-off-by: NKarsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: NUrsula Braun <ubraun@linux.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      82060d93
    • T
      sch_cake: Correctly update parent qlen when splitting GSO packets · 146f563f
      Toke Høiland-Jørgensen 提交于
      [ Upstream commit 8c6c37fdc20ec9ffaa342f827a8e20afe736fb0c ]
      
      To ensure parent qdiscs have the same notion of the number of enqueued
      packets even after splitting a GSO packet, update the qdisc tree with the
      number of packets that was added due to the split.
      Reported-by: NPete Heist <pete@heistp.net>
      Tested-by: NPete Heist <pete@heistp.net>
      Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      146f563f
  3. 13 12月, 2019 19 次提交
  4. 05 12月, 2019 10 次提交
    • Y
      tcp: exit if nothing to retransmit on RTO timeout · d832269a
      Yuchung Cheng 提交于
      commit 88f8598d0a302a08380eadefd09b9f5cb1c4c428 upstream.
      
      Previously TCP only warns if its RTO timer fires and the
      retransmission queue is empty, but it'll cause null pointer
      reference later on. It's better to avoid such catastrophic failure
      and simply exit with a warning.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reviewed-by: NNeal Cardwell <ncardwell@google.com>
      Reviewed-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d832269a
    • D
      net: sched: fix `tc -s class show` no bstats on class with nolock subqueues · 50ee7a49
      Dust Li 提交于
      [ Upstream commit 14e54ab9143fa60794d13ea0a66c792a2046a8f3 ]
      
      When a classful qdisc's child qdisc has set the flag
      TCQ_F_CPUSTATS (pfifo_fast for example), the child qdisc's
      cpu_bstats should be passed to gnet_stats_copy_basic(),
      but many classful qdisc didn't do that. As a result,
      `tc -s class show dev DEV` always return 0 for bytes and
      packets in this case.
      
      Pass the child qdisc's cpu_bstats to gnet_stats_copy_basic()
      to fix this issue.
      
      The qstats also has this problem, but it has been fixed
      in 5dd431b6b9 ("net: sched: introduce and use qstats read...")
      and bstats still remains buggy.
      
      Fixes: 22e0f8b9 ("net: sched: make bstats per cpu and estimator RCU safe")
      Signed-off-by: NDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: NTony Lu <tonylu@linux.alibaba.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50ee7a49
    • X
      sctp: cache netns in sctp_ep_common · 1bef71f5
      Xin Long 提交于
      [ Upstream commit 312434617cb16be5166316cf9d08ba760b1042a1 ]
      
      This patch is to fix a data-race reported by syzbot:
      
        BUG: KCSAN: data-race in sctp_assoc_migrate / sctp_hash_obj
      
        write to 0xffff8880b67c0020 of 8 bytes by task 18908 on cpu 1:
          sctp_assoc_migrate+0x1a6/0x290 net/sctp/associola.c:1091
          sctp_sock_migrate+0x8aa/0x9b0 net/sctp/socket.c:9465
          sctp_accept+0x3c8/0x470 net/sctp/socket.c:4916
          inet_accept+0x7f/0x360 net/ipv4/af_inet.c:734
          __sys_accept4+0x224/0x430 net/socket.c:1754
          __do_sys_accept net/socket.c:1795 [inline]
          __se_sys_accept net/socket.c:1792 [inline]
          __x64_sys_accept+0x4e/0x60 net/socket.c:1792
          do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        read to 0xffff8880b67c0020 of 8 bytes by task 12003 on cpu 0:
          sctp_hash_obj+0x4f/0x2d0 net/sctp/input.c:894
          rht_key_get_hash include/linux/rhashtable.h:133 [inline]
          rht_key_hashfn include/linux/rhashtable.h:159 [inline]
          rht_head_hashfn include/linux/rhashtable.h:174 [inline]
          head_hashfn lib/rhashtable.c:41 [inline]
          rhashtable_rehash_one lib/rhashtable.c:245 [inline]
          rhashtable_rehash_chain lib/rhashtable.c:276 [inline]
          rhashtable_rehash_table lib/rhashtable.c:316 [inline]
          rht_deferred_worker+0x468/0xab0 lib/rhashtable.c:420
          process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
          worker_thread+0xa0/0x800 kernel/workqueue.c:2415
          kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
          ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      It was caused by rhashtable access asoc->base.sk when sctp_assoc_migrate
      is changing its value. However, what rhashtable wants is netns from asoc
      base.sk, and for an asoc, its netns won't change once set. So we can
      simply fix it by caching netns since created.
      
      Fixes: d6c0256a ("sctp: add the rhashtable apis for sctp global transport hashtable")
      Reported-by: syzbot+e3b35fe7918ff0ee474e@syzkaller.appspotmail.com
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1bef71f5
    • J
      tipc: fix link name length check · 2e849d71
      John Rutherford 提交于
      [ Upstream commit fd567ac20cb0377ff466d3337e6e9ac5d0cb15e4 ]
      
      In commit 4f07b80c9733 ("tipc: check msg->req data len in
      tipc_nl_compat_bearer_disable") the same patch code was copied into
      routines: tipc_nl_compat_bearer_disable(),
      tipc_nl_compat_link_stat_dump() and tipc_nl_compat_link_reset_stats().
      The two link routine occurrences should have been modified to check
      the maximum link name length and not bearer name length.
      
      Fixes: 4f07b80c9733 ("tipc: check msg->reg data len in tipc_nl_compat_bearer_disable")
      Signed-off-by: NJohn Rutherford <john.rutherford@dektech.com.au>
      Acked-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e849d71
    • P
      openvswitch: remove another BUG_ON() · 6ed8f48e
      Paolo Abeni 提交于
      [ Upstream commit 8a574f86652a4540a2433946ba826ccb87f398cc ]
      
      If we can't build the flow del notification, we can simply delete
      the flow, no need to crash the kernel. Still keep a WARN_ON to
      preserve debuggability.
      
      Note: the BUG_ON() predates the Fixes tag, but this change
      can be applied only after the mentioned commit.
      
      v1 -> v2:
       - do not leak an skb on error
      
      Fixes: aed06778 ("openvswitch: Minimize ovs_flow_cmd_del critical section.")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6ed8f48e
    • P
      openvswitch: drop unneeded BUG_ON() in ovs_flow_cmd_build_info() · 7e48458f
      Paolo Abeni 提交于
      [ Upstream commit 8ffeb03fbba3b599690b361467bfd2373e8c450f ]
      
      All the callers of ovs_flow_cmd_build_info() already deal with
      error return code correctly, so we can handle the error condition
      in a more gracefull way. Still dump a warning to preserve
      debuggability.
      
      v1 -> v2:
       - clarify the commit message
       - clean the skb and report the error (DaveM)
      
      Fixes: ccb1352e ("net: Add Open vSwitch kernel components.")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e48458f
    • N
      sctp: Fix memory leak in sctp_sf_do_5_2_4_dupcook · 681e0849
      Navid Emamdoost 提交于
      [ Upstream commit b6631c6031c746ed004c4221ec0616d7a520f441 ]
      
      In the implementation of sctp_sf_do_5_2_4_dupcook() the allocated
      new_asoc is leaked if security_sctp_assoc_request() fails. Release it
      via sctp_association_free().
      
      Fixes: 2277c7cd ("sctp: Add LSM hooks")
      Signed-off-by: NNavid Emamdoost <navid.emamdoost@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      681e0849
    • P
      openvswitch: fix flow command message size · ae0ab395
      Paolo Abeni 提交于
      [ Upstream commit 4e81c0b3fa93d07653e2415fa71656b080a112fd ]
      
      When user-space sets the OVS_UFID_F_OMIT_* flags, and the relevant
      flow has no UFID, we can exceed the computed size, as
      ovs_nla_put_identifier() will always dump an OVS_FLOW_ATTR_KEY
      attribute.
      Take the above in account when computing the flow command message
      size.
      
      Fixes: 74ed7ab9 ("openvswitch: Add support for unique flow IDs.")
      Reported-by: NQi Jun Ding <qding@redhat.com>
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae0ab395
    • N
      net: psample: fix skb_over_panic · 4f8bd02b
      Nikolay Aleksandrov 提交于
      [ Upstream commit 7eb9d7675c08937cd11d32b0b40442d4d731c5ee ]
      
      We need to calculate the skb size correctly otherwise we risk triggering
      skb_over_panic[1]. The issue is that data_len is added to the skb in a
      nl attribute, but we don't account for its header size (nlattr 4 bytes)
      and alignment. We account for it when calculating the total size in
      the > PSAMPLE_MAX_PACKET_SIZE comparison correctly, but not when
      allocating after that. The fix is simple - use nla_total_size() for
      data_len when allocating.
      
      To reproduce:
       $ tc qdisc add dev eth1 clsact
       $ tc filter add dev eth1 egress matchall action sample rate 1 group 1 trunc 129
       $ mausezahn eth1 -b bcast -a rand -c 1 -p 129
       < skb_over_panic BUG(), tail is 4 bytes past skb->end >
      
      [1] Trace:
       [   50.459526][ T3480] skbuff: skb_over_panic: text:(____ptrval____) len:196 put:136 head:(____ptrval____) data:(____ptrval____) tail:0xc4 end:0xc0 dev:<NULL>
       [   50.474339][ T3480] ------------[ cut here ]------------
       [   50.481132][ T3480] kernel BUG at net/core/skbuff.c:108!
       [   50.486059][ T3480] invalid opcode: 0000 [#1] PREEMPT SMP
       [   50.489463][ T3480] CPU: 3 PID: 3480 Comm: mausezahn Not tainted 5.4.0-rc7 #108
       [   50.492844][ T3480] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
       [   50.496551][ T3480] RIP: 0010:skb_panic+0x79/0x7b
       [   50.498261][ T3480] Code: bc 00 00 00 41 57 4c 89 e6 48 c7 c7 90 29 9a 83 4c 8b 8b c0 00 00 00 50 8b 83 b8 00 00 00 50 ff b3 c8 00 00 00 e8 ae ef c0 fe <0f> 0b e8 2f df c8 fe 48 8b 55 08 44 89 f6 4c 89 e7 48 c7 c1 a0 22
       [   50.504111][ T3480] RSP: 0018:ffffc90000447a10 EFLAGS: 00010282
       [   50.505835][ T3480] RAX: 0000000000000087 RBX: ffff888039317d00 RCX: 0000000000000000
       [   50.507900][ T3480] RDX: 0000000000000000 RSI: ffffffff812716e1 RDI: 00000000ffffffff
       [   50.509820][ T3480] RBP: ffffc90000447a60 R08: 0000000000000001 R09: 0000000000000000
       [   50.511735][ T3480] R10: ffffffff81d4f940 R11: 0000000000000000 R12: ffffffff834a22b0
       [   50.513494][ T3480] R13: ffffffff82c10433 R14: 0000000000000088 R15: ffffffff838a8084
       [   50.515222][ T3480] FS:  00007f3536462700(0000) GS:ffff88803eac0000(0000) knlGS:0000000000000000
       [   50.517135][ T3480] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [   50.518583][ T3480] CR2: 0000000000442008 CR3: 000000003b222000 CR4: 00000000000006e0
       [   50.520723][ T3480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [   50.522709][ T3480] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [   50.524450][ T3480] Call Trace:
       [   50.525214][ T3480]  skb_put.cold+0x1b/0x1b
       [   50.526171][ T3480]  psample_sample_packet+0x1d3/0x340
       [   50.527307][ T3480]  tcf_sample_act+0x178/0x250
       [   50.528339][ T3480]  tcf_action_exec+0xb1/0x190
       [   50.529354][ T3480]  mall_classify+0x67/0x90
       [   50.530332][ T3480]  tcf_classify+0x72/0x160
       [   50.531286][ T3480]  __dev_queue_xmit+0x3db/0xd50
       [   50.532327][ T3480]  dev_queue_xmit+0x18/0x20
       [   50.533299][ T3480]  packet_sendmsg+0xee7/0x2090
       [   50.534331][ T3480]  sock_sendmsg+0x54/0x70
       [   50.535271][ T3480]  __sys_sendto+0x148/0x1f0
       [   50.536252][ T3480]  ? tomoyo_file_ioctl+0x23/0x30
       [   50.537334][ T3480]  ? ksys_ioctl+0x5e/0xb0
       [   50.540068][ T3480]  __x64_sys_sendto+0x2a/0x30
       [   50.542810][ T3480]  do_syscall_64+0x73/0x1f0
       [   50.545383][ T3480]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
       [   50.548477][ T3480] RIP: 0033:0x7f35357d6fb3
       [   50.551020][ T3480] Code: 48 8b 0d 18 90 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 d3 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 eb f6 ff ff 48 89 04 24
       [   50.558547][ T3480] RSP: 002b:00007ffe0c7212c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
       [   50.561870][ T3480] RAX: ffffffffffffffda RBX: 0000000001dac010 RCX: 00007f35357d6fb3
       [   50.565142][ T3480] RDX: 0000000000000082 RSI: 0000000001dac2a2 RDI: 0000000000000003
       [   50.568469][ T3480] RBP: 00007ffe0c7212f0 R08: 00007ffe0c7212d0 R09: 0000000000000014
       [   50.571731][ T3480] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000082
       [   50.574961][ T3480] R13: 0000000001dac2a2 R14: 0000000000000001 R15: 0000000000000003
       [   50.578170][ T3480] Modules linked in: sch_ingress virtio_net
       [   50.580976][ T3480] ---[ end trace 61a515626a595af6 ]---
      
      CC: Yotam Gigi <yotamg@mellanox.com>
      CC: Jiri Pirko <jiri@mellanox.com>
      CC: Jamal Hadi Salim <jhs@mojatatu.com>
      CC: Simon Horman <simon.horman@netronome.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Fixes: 6ae0a628 ("net: Introduce psample, a new genetlink channel for packet sampling")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f8bd02b
    • S
      xfrm: Fix memleak on xfrm state destroy · 55b5cbaa
      Steffen Klassert 提交于
      commit 86c6739eda7d2a03f2db30cbee67a5fb81afa8ba upstream.
      
      We leak the page that we use to create skb page fragments
      when destroying the xfrm_state. Fix this by dropping a
      page reference if a page was assigned to the xfrm_state.
      
      Fixes: cac2661c ("esp4: Avoid skb_cow_data whenever possible")
      Reported-by: NJD <jdtxs00@gmail.com>
      Reported-by: NPaul Wouters <paul@nohats.ca>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55b5cbaa