1. 14 Apr, 2022 1 commit
  2. 12 Apr, 2022 39 commits
    • I
      Revert "module, async: async_synchronize_full() on module init iff async is used" · dc8da50c
      Committed by Igor Pylypiv
      stable inclusion
      from linux-4.19.231
      commit a0c66ac8b72f816d5631fde0ca0b39af602dce48
      
      --------------------------------
      
      [ Upstream commit 67d6212a ]
      
      This reverts commit 774a1221.
      
      We need to finish all async code before the module init sequence is
      done.  In the reverted commit the PF_USED_ASYNC flag was added to mark a
      thread that called async_schedule().  Then the PF_USED_ASYNC flag was
      used to determine whether or not async_synchronize_full() needs to be
      invoked.  This works when the modprobe thread is calling async_schedule(),
      but it does not work if the module dispatches init code to a worker thread
      which then calls async_schedule().
      
      For example, PCI driver probing is invoked from a worker thread chosen
      according to the node where the device is attached:
      
      	if (cpu < nr_cpu_ids)
      		error = work_on_cpu(cpu, local_pci_probe, &ddi);
      	else
      		error = local_pci_probe(&ddi);
      
      We end up in a situation where a worker thread gets the PF_USED_ASYNC
      flag set instead of the modprobe thread.  As a result,
      async_synchronize_full() is not invoked and modprobe completes without
      waiting for the async code to finish.
      
      The issue was discovered while loading the pm80xx driver:
      (scsi_mod.scan=async)
      
      modprobe pm80xx                      worker
      ...
        do_init_module()
        ...
          pci_call_probe()
            work_on_cpu(local_pci_probe)
                                           local_pci_probe()
                                             pm8001_pci_probe()
                                               scsi_scan_host()
                                                 async_schedule()
                                                 worker->flags |= PF_USED_ASYNC;
                                           ...
            < return from worker >
        ...
        if (current->flags & PF_USED_ASYNC) <--- false
        	async_synchronize_full();
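
The timeline above can be modelled in a few lines of user-space C. This is only a sketch of the flag bookkeeping, not kernel code: `struct sketch_task`, the `sketch_*` helpers and the flag value are hypothetical names introduced for illustration. It shows why the check on the modprobe task comes out false when the probe, and hence the async_schedule() call, runs on a worker.

```c
#include <stdbool.h>

/* Illustrative flag value; only its distinctness matters in this model. */
#define SKETCH_PF_USED_ASYNC 0x00004000u

/* Hypothetical stand-in for a task with per-task flags. */
struct sketch_task {
	unsigned int flags;
};

/* Models async_schedule(): the flag is set on whichever task makes the call. */
static void sketch_async_schedule(struct sketch_task *caller)
{
	caller->flags |= SKETCH_PF_USED_ASYNC;
}

/* Models the reverted check made at the end of module init. */
static bool sketch_would_synchronize(const struct sketch_task *modprobe)
{
	return (modprobe->flags & SKETCH_PF_USED_ASYNC) != 0;
}

/*
 * Models one module load. If probe_on_worker is true, the probe (and its
 * async_schedule() call) runs on a separate worker task, as in the
 * work_on_cpu() path described above.
 */
static bool sketch_module_init(bool probe_on_worker)
{
	struct sketch_task modprobe = { 0 };
	struct sketch_task worker = { 0 };

	sketch_async_schedule(probe_on_worker ? &worker : &modprobe);
	return sketch_would_synchronize(&modprobe);
}
```

In this model, `sketch_module_init(false)` reports that synchronization would happen, while `sketch_module_init(true)` reports that it would be skipped, which is the failure described in this commit.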
      
      Commit 21c3c5d2 ("block: don't request module during elevator init")
      fixed the deadlock issue which the reverted commit 774a1221
      ("module, async: async_synchronize_full() on module init iff async is
      used") tried to fix.
      
      Since commit 0fdff3ec ("async, kmod: warn on synchronous
      request_module() from async workers") synchronous module loading from
      async is not allowed.
      
      Given that the original deadlock issue is fixed and it is no longer
      allowed to call synchronous request_module() from async we can remove
      PF_USED_ASYNC flag to make module init consistently invoke
      async_synchronize_full() unless async module probe is requested.
      
      Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
      Reviewed-by: Changyuan Lyu <changyuanl@google.com>
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • D
      tty: n_gsm: fix encoding of control signal octet bit DV · be439c06
      Committed by daniel.starke@siemens.com
      stable inclusion
      from linux-4.19.232
      commit 28ca082153794cf5c98e7bb93d7f30f8ba46bec4
      
      --------------------------------
      
      commit 737b0ef3 upstream.
      
      n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
      See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
      The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
      the newer 27.010 here. Chapter 5.4.6.3.7 describes the encoding of the
      control signal octet used by the MSC (modem status command). The same
      encoding is also used in convergence layer type 2 as described in chapter
      5.5.2. Table 7 and 24 both require the DV (data valid) bit to be set 1 for
      outgoing control signal octets sent by the DTE (data terminal equipment),
      i.e. for the initiator side.
      Currently, the DV bit is only set if CD (carrier detect) is on, regardless
      of the side.
      
      This patch fixes this behavior by setting the DV bit on the initiator side
      unconditionally.
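
A minimal sketch of the behaviour described above, in plain user-space C. The bit positions, `struct sketch_signals` and `sketch_encode_modem()` are assumptions made for illustration and are not taken from the driver or the specification text; only the rule "DV is set when CD is on, or always on the initiator side" comes from this commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit assignments for the control signal octet (assumed). */
#define SKETCH_MDM_RTC 0x02u
#define SKETCH_MDM_RTR 0x04u
#define SKETCH_MDM_IC  0x20u
#define SKETCH_MDM_DV  0x40u

/* Hypothetical view of the local modem lines. */
struct sketch_signals {
	bool dtr;
	bool rts;
	bool ri;
	bool cd;
};

/* Encode the octet; the initiator side sets DV regardless of CD. */
static uint8_t sketch_encode_modem(const struct sketch_signals *s, bool initiator)
{
	uint8_t bits = 0;

	if (s->dtr)
		bits |= SKETCH_MDM_RTC;
	if (s->rts)
		bits |= SKETCH_MDM_RTR;
	if (s->ri)
		bits |= SKETCH_MDM_IC;
	if (s->cd || initiator)
		bits |= SKETCH_MDM_DV;
	return bits;
}
```

With all lines low, the initiator still produces an octet with DV set, while the responder side produces DV only when CD is on.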
      
      Fixes: e1eaea46 ("tty: n_gsm line discipline")
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
      Link: https://lore.kernel.org/r/20220218073123.2121-1-daniel.starke@siemens.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • L
      fget: clarify and improve __fget_files() implementation · 8e227868
      Committed by Linus Torvalds
      stable inclusion
      from linux-4.19.232
      commit 400c2f361c25bc092d0636cfa32d0549a181e653
      
      --------------------------------
      
      commit e386dfc5 upstream.
      
      Commit 054aa8d4 ("fget: check that the fd still exists after getting
      a ref to it") fixed a race with getting a reference to a file just as it
      was being closed.  It was a fairly minimal patch, and I didn't think
      re-checking the file pointer lookup would be a measurable overhead,
      since it was all right there and cached.
      
      But I was wrong, as pointed out by the kernel test robot.
      
      The 'poll2' case of the will-it-scale.per_thread_ops benchmark regressed
      quite noticeably.  Admittedly it seems to be a very artificial test:
      doing "poll()" system calls on regular files in a very tight loop in
      multiple threads.
      
      That means that basically all the time is spent just looking up file
      descriptors without ever doing anything useful with them (not that doing
      'poll()' on a regular file is useful to begin with).  And as a result it
      shows the extra "re-check fd" cost as a sore thumb.
      
      Happily, the regression is fixable by just writing the code to look up
      the fd to be better and clearer.  There's still a cost to verify the
      file pointer, but now it's basically in the noise even for that
      benchmark that does nothing else - and the code is more understandable
      and has better comments too.
      
      [ Side note: this patch is also a classic case of one that looks very
        messy with the default greedy Myers diff - it's much more legible with
        either the patience or histogram diff algorithm ]
      
      Link: https://lore.kernel.org/lkml/20211210053743.GA36420@xsang-OptiPlex-9020/
      Link: https://lore.kernel.org/lkml/20211213083154.GA20853@linux.intel.com/
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Tested-by: Carel Si <beibei.si@intel.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • M
      memblock: use kfree() to release kmalloced memblock regions · 7287cbc1
      Committed by Miaohe Lin
      stable inclusion
      from linux-4.19.232
      commit a6fcced73d15ab57cd97813999edf6f19d3032f8
      
      --------------------------------
      
      commit c94afc46 upstream.
      
      memblock.{reserved,memory}.regions may be allocated using kmalloc() in
      memblock_double_array(). Use kfree() to release these kmalloced regions
      indicated by memblock_{reserved,memory}_in_slab.
      
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Fixes: 3010f876 ("mm: discard memblock data later")
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • D
      tty: n_gsm: fix proper link termination after failed open · 99c367d7
      Committed by daniel.starke@siemens.com
      stable inclusion
      from linux-4.19.232
      commit 337e49675ce55c23a50b92aae889ec5d910d6dc7
      
      --------------------------------
      
      commit e3b7468f upstream.
      
      Trying to open a DLCI by sending a SABM frame may fail with a timeout.
      The link is closed on the initiator side without informing the responder
      about this event. The responder assumes the link is open after sending a
      UA frame to answer the SABM frame. The link gets stuck in a half open
      state.
      
      This patch fixes this by initiating the proper link termination procedure
      after link setup timeout instead of silently closing it down.
      
      Fixes: e1eaea46 ("tty: n_gsm line discipline")
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
      Link: https://lore.kernel.org/r/20220218073123.2121-3-daniel.starke@siemens.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • T
      gso: do not skip outer ip header in case of ipip and net_failover · df2f0cf3
      Committed by Tao Liu
      stable inclusion
      from linux-4.19.232
      commit e9ffbe63f6f32f526a461756309b61c395168d73
      
      --------------------------------
      
      commit cc20cced upstream.
      
      We encountered a TCP drop issue in our cloud environment. A packet GROed
      on the host is forwarded to a VM virtio_net NIC with net_failover enabled.
      The VM acts as an IPVS LB with ipip encapsulation. The full path is:
      host gro -> vm virtio_net rx -> net_failover rx -> ipvs fullnat
       -> ipip encap -> net_failover tx -> virtio_net tx
      
      When net_failover transmits an ipip packet (gso_type = 0x0103, which means
      SKB_GSO_TCPV4, SKB_GSO_DODGY and SKB_GSO_IPXIP4), no GSO is performed
      because the device supports TSO and GSO_IPXIP4. But network_header is left
      pointing to the inner IP header.
      
      Call Trace:
       tcp4_gso_segment        ------> return NULL
       inet_gso_segment        ------> inner iph, network_header points to
       ipip_gso_segment
       inet_gso_segment        ------> outer iph
       skb_mac_gso_segment
      
      Afterwards, when virtio_net transmits the packet, only the inner IP header
      is modified and the outer one is left unchanged. The packet will be dropped
      by the remote host.
      
      Call Trace:
       inet_gso_segment        ------> inner iph, outer iph is skipped
       skb_mac_gso_segment
       __skb_gso_segment
       validate_xmit_skb
       validate_xmit_skb_list
       sch_direct_xmit
       __qdisc_run
       __dev_queue_xmit        ------> virtio_net
       dev_hard_start_xmit
       __dev_queue_xmit        ------> net_failover
       ip_finish_output2
       ip_output
       iptunnel_xmit
       ip_tunnel_xmit
       ipip_tunnel_xmit        ------> ipip
       dev_hard_start_xmit
       __dev_queue_xmit
       ip_finish_output2
       ip_output
       ip_forward
       ip_rcv
       __netif_receive_skb_one_core
       netif_receive_skb_internal
       napi_gro_receive
       receive_buf
       virtnet_poll
       net_rx_action
      
      The root cause of this issue is specific to the rare combination of
      SKB_GSO_DODGY and a tunnel device that adds an SKB_GSO_ tunnel option.
      SKB_GSO_DODGY is set by the external virtio_net. We need to reset the
      network header when callbacks.gso_segment() returns NULL.
      
      This patch also includes ipv6_gso_segment(), considering SIT, etc.
      
      Fixes: cb32f511 ("ipip: add GSO/TSO support")
      Signed-off-by: Tao Liu <thomas.liu@ucloud.cn>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • E
      net: __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor friends · 220832c5
      Committed by Eric Dumazet
      stable inclusion
      from linux-4.19.232
      commit 1f4ae0f158dafa74133108bfa07b8053eb2a7898
      
      --------------------------------
      
      commit ef527f96 upstream.
      
      Whenever one of these functions pulls all data from an skb in a frag_list,
      use consume_skb() instead of kfree_skb() to avoid polluting drop
      monitoring.
      
      Fixes: 6fa01ccd ("skbuff: Add pskb_extract() helper function")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220220154052.1308469-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • Z
      cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug · 6d17be8f
      Committed by Zhang Qiao
      stable inclusion
      from linux-4.19.232
      commit 4eec5fe1c680a6c47a9bc0cde00960a4eb663342
      
      --------------------------------
      
      commit 05c7b7a9 upstream.
      
      As previously discussed (https://lkml.org/lkml/2022/1/20/51),
      cpuset_attach() is affected by a similar cpu hotplug race,
      as in the following scenario:
      
           cpuset_attach()				cpu hotplug
          ---------------------------            ----------------------
          down_write(cpuset_rwsem)
          guarantee_online_cpus() // (load cpus_attach)
      					sched_cpu_deactivate
      					  set_cpu_active()
      					  // will change cpu_active_mask
          set_cpus_allowed_ptr(cpus_attach)
            __set_cpus_allowed_ptr_locked()
             // (if the intersection of cpus_attach and
               cpu_active_mask is empty, will return -EINVAL)
          up_write(cpuset_rwsem)
      
      To avoid races such as described above, protect cpuset_attach() call
      with cpu_hotplug_lock.
      
      Fixes: be367d09 ("cgroups: let ss->can_attach and ss->attach do whole threadgroups at a time")
      Cc: stable@vger.kernel.org # v2.6.32+
      Reported-by: Zhao Gongyi <zhaogongyi@huawei.com>
      Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
      Acked-by: Waiman Long <longman@redhat.com>
      Reviewed-by: Michal Koutný <mkoutny@suse.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • J
      tracing: Fix tp_printk option related with tp_printk_stop_on_boot · 8dff553c
      Committed by JaeSang Yoo
      stable inclusion
      from linux-4.19.231
      commit 8f8c9e71e192823e4d76fdc53b4391642291bafc
      
      --------------------------------
      
      [ Upstream commit 3203ce39 ]
      
      The kernel parameter "tp_printk_stop_on_boot" starts with "tp_printk" which is
      the same as another kernel parameter "tp_printk". If "tp_printk" setup is
      called before the "tp_printk_stop_on_boot", it will override the latter
      and keep it from being set.
      
      This is similar to other kernel parameter issues, such as:
        Commit 745a600c ("um: console: Ignore console= option")
      or init/do_mounts.c:45 (setup function of "ro" kernel param)
      
      Fix it by checking for a "_" right after the "tp_printk" and if that
      exists do not process the parameter.
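
The prefix collision and the fix can be sketched in user-space C as follows. This is a simplified model, not the kernel's parameter parser: `sketch_tp_printk_accepts()` is a hypothetical helper that combines the prefix match with the "_" check described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true if a handler registered for the name "tp_printk" should
 * consume the given command-line parameter. A plain prefix match would
 * also accept "tp_printk_stop_on_boot"; checking for '_' directly after
 * the name rejects it so the longer parameter reaches its own handler.
 */
static bool sketch_tp_printk_accepts(const char *param)
{
	static const char name[] = "tp_printk";
	const size_t n = sizeof(name) - 1;

	if (strncmp(param, name, n) != 0)
		return false;	/* not our parameter at all */
	if (param[n] == '_')
		return false;	/* a longer parameter sharing the prefix */
	return true;
}
```

Under this model, "tp_printk" and "tp_printk=1" are accepted, while "tp_printk_stop_on_boot" is left for its own handler.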
      
      Link: https://lkml.kernel.org/r/20220208195421.969326-1-jsyoo5b@gmail.com
      Signed-off-by: JaeSang Yoo <jsyoo5b@gmail.com>
      [ Fixed up change log and added space after if condition ]
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • J
      dmaengine: sh: rcar-dmac: Check for error num after setting mask · 397e43da
      Committed by Jiasheng Jiang
      stable inclusion
      from linux-4.19.231
      commit 783d70c94e513ca715643c00c6c275d9eb3b1a9e
      
      --------------------------------
      
      commit 2d21543e upstream.
      
      Because dma_supported() can fail, dma_set_mask_and_coherent() may
      return an error code.
      Therefore, check the return value and propagate the error if the
      call fails.
      
      Fixes: dc312349 ("dmaengine: rcar-dmac: Widen DMA mask to 40 bits")
      Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
      Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Link: https://lore.kernel.org/r/20220106030939.2644320-1-jiasheng@iscas.ac.cn
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • E
      net: sched: limit TC_ACT_REPEAT loops · b4a48e80
      Committed by Eric Dumazet
      stable inclusion
      from linux-4.19.231
      commit f63c9fa36bd7548fd07792127057cccb68a7e274
      
      --------------------------------
      
      commit 5740d068 upstream.
      
      We have been living dangerously, at the mercy of malicious users,
      abusing TC_ACT_REPEAT, as shown by this syzbot report [1].
      
      Add an arbitrary limit (32) to the number of times an action can
      return TC_ACT_REPEAT.
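
A minimal sketch of such a bounded repeat loop, in user-space C. The return codes, `SKETCH_REPEAT_LIMIT`, the callback type and both functions are hypothetical names chosen for illustration; only the limit of 32 and the idea of stopping instead of looping forever come from this commit message. The rate-limited warning mentioned in the v2 note is left out.

```c
/* Illustrative return codes; the numeric values are arbitrary here. */
enum {
	SKETCH_ACT_OK = 0,
	SKETCH_ACT_REPEAT = 1
};

#define SKETCH_REPEAT_LIMIT 32

/* Hypothetical action callback: returns one of the codes above. */
typedef int (*sketch_action_fn)(void *ctx);

/*
 * Run one action, re-running it while it asks to be repeated, but never
 * more than SKETCH_REPEAT_LIMIT times in total. When the limit is hit the
 * pipeline is stopped and the packet is treated as accepted.
 */
static int sketch_run_action(sketch_action_fn act, void *ctx)
{
	int ttl = SKETCH_REPEAT_LIMIT;

	for (;;) {
		int ret = act(ctx);

		if (ret != SKETCH_ACT_REPEAT)
			return ret;
		if (--ttl == 0)
			return SKETCH_ACT_OK;	/* suspicious: give up */
	}
}

/* Misbehaving action used for demonstration: always asks to repeat. */
static int sketch_always_repeat(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return SKETCH_ACT_REPEAT;
}
```

Passing `sketch_always_repeat` with a call counter shows the loop terminating after the limit is reached instead of spinning on the CPU as in the stall report below.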
      
      v2: switch the limit to 32 instead of 10.
          Use net_warn_ratelimited() instead of pr_err_once().
      
      [1] (C repro available on demand)
      
      rcu: INFO: rcu_preempt self-detected stall on CPU
      rcu:    1-...!: (10500 ticks this GP) idle=021/1/0x4000000000000000 softirq=5592/5592 fqs=0
              (t=10502 jiffies g=5305 q=190)
      rcu: rcu_preempt kthread timer wakeup didn't happen for 10502 jiffies! g5305 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
      rcu:    Possible timer handling issue on cpu=0 timer-softirq=3527
      rcu: rcu_preempt kthread starved for 10505 jiffies! g5305 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
      rcu:    Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
      rcu: RCU grace-period kthread stack dump:
      task:rcu_preempt     state:I stack:29344 pid:   14 ppid:     2 flags:0x00004000
      Call Trace:
       <TASK>
       context_switch kernel/sched/core.c:4986 [inline]
       __schedule+0xab2/0x4db0 kernel/sched/core.c:6295
       schedule+0xd2/0x260 kernel/sched/core.c:6368
       schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881
       rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1963
       rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2136
       kthread+0x2e9/0x3a0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
       </TASK>
      rcu: Stack dump where RCU GP kthread last ran:
      Sending NMI from CPU 1 to CPUs 0:
      NMI backtrace for cpu 0
      CPU: 0 PID: 3646 Comm: syz-executor358 Not tainted 5.17.0-rc3-syzkaller-00149-gbf8e59fd #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:rep_nop arch/x86/include/asm/vdso/processor.h:13 [inline]
      RIP: 0010:cpu_relax arch/x86/include/asm/vdso/processor.h:18 [inline]
      RIP: 0010:pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:437 [inline]
      RIP: 0010:__pv_queued_spin_lock_slowpath+0x3b8/0xb40 kernel/locking/qspinlock.c:508
      Code: 48 89 eb c6 45 01 01 41 bc 00 80 00 00 48 c1 e9 03 83 e3 07 41 be 01 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8d 2c 01 eb 0c <f3> 90 41 83 ec 01 0f 84 72 04 00 00 41 0f b6 45 00 38 d8 7f 08 84
      RSP: 0018:ffffc9000283f1b0 EFLAGS: 00000206
      RAX: 0000000000000003 RBX: 0000000000000000 RCX: 1ffff1100fc0071e
      RDX: 0000000000000001 RSI: 0000000000000201 RDI: 0000000000000000
      RBP: ffff88807e0038f0 R08: 0000000000000001 R09: ffffffff8ffbf9ff
      R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000004c1e
      R13: ffffed100fc0071e R14: 0000000000000001 R15: ffff8880b9c3aa80
      FS:  00005555562bf300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffdbfef12b8 CR3: 00000000723c2000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
       queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
       queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
       do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:115
       spin_lock_bh include/linux/spinlock.h:354 [inline]
       sch_tree_lock include/net/sch_generic.h:610 [inline]
       sch_tree_lock include/net/sch_generic.h:605 [inline]
       prio_tune+0x3b9/0xb50 net/sched/sch_prio.c:211
       prio_init+0x5c/0x80 net/sched/sch_prio.c:244
       qdisc_create.constprop.0+0x44a/0x10f0 net/sched/sch_api.c:1253
       tc_modify_qdisc+0x4c5/0x1980 net/sched/sch_api.c:1660
       rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5594
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
       netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
       netlink_unicast+0x539/0x7e0 net/netlink/af_netlink.c:1343
       netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:725
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2413
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2467
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f7ee98aae99
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007ffdbfef12d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007ffdbfef1300 RCX: 00007f7ee98aae99
      RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d
      R10: 000000000000000d R11: 0000000000000246 R12: 00007ffdbfef12f0
      R13: 00000000000f4240 R14: 000000000004ca47 R15: 00007ffdbfef12e4
       </TASK>
      INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 2.293 msecs
      NMI backtrace for cpu 1
      CPU: 1 PID: 3260 Comm: kworker/1:3 Not tainted 5.17.0-rc3-syzkaller-00149-gbf8e59fd #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: mld mld_ifc_work
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111
       nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
       trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
       rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:343
       print_cpu_stall kernel/rcu/tree_stall.h:604 [inline]
       check_cpu_stall kernel/rcu/tree_stall.h:688 [inline]
       rcu_pending kernel/rcu/tree.c:3919 [inline]
       rcu_sched_clock_irq.cold+0x5c/0x759 kernel/rcu/tree.c:2617
       update_process_times+0x16d/0x200 kernel/time/timer.c:1785
       tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
       tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1428
       __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
       __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749
       hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
       local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
       __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
       sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
       </IRQ>
       <TASK>
       asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
      RIP: 0010:__sanitizer_cov_trace_const_cmp4+0xc/0x70 kernel/kcov.c:286
      Code: 00 00 00 48 89 7c 30 e8 48 89 4c 30 f0 4c 89 54 d8 20 48 89 10 5b c3 0f 1f 80 00 00 00 00 41 89 f8 bf 03 00 00 00 4c 8b 14 24 <89> f1 65 48 8b 34 25 00 70 02 00 e8 14 f9 ff ff 84 c0 74 4b 48 8b
      RSP: 0018:ffffc90002c5eea8 EFLAGS: 00000246
      RAX: 0000000000000007 RBX: ffff88801c625800 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: ffff8880137d3100 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff874fcd88 R11: 0000000000000000 R12: ffff88801d692dc0
      R13: ffff8880137d3104 R14: 0000000000000000 R15: ffff88801d692de8
       tcf_police_act+0x358/0x11d0 net/sched/act_police.c:256
       tcf_action_exec net/sched/act_api.c:1049 [inline]
       tcf_action_exec+0x1a6/0x530 net/sched/act_api.c:1026
       tcf_exts_exec include/net/pkt_cls.h:326 [inline]
       route4_classify+0xef0/0x1400 net/sched/cls_route.c:179
       __tcf_classify net/sched/cls_api.c:1549 [inline]
       tcf_classify+0x3e8/0x9d0 net/sched/cls_api.c:1615
       prio_classify net/sched/sch_prio.c:42 [inline]
       prio_enqueue+0x3a7/0x790 net/sched/sch_prio.c:75
       dev_qdisc_enqueue+0x40/0x300 net/core/dev.c:3668
       __dev_xmit_skb net/core/dev.c:3756 [inline]
       __dev_queue_xmit+0x1f61/0x3660 net/core/dev.c:4081
       neigh_hh_output include/net/neighbour.h:533 [inline]
       neigh_output include/net/neighbour.h:547 [inline]
       ip_finish_output2+0x14dc/0x2170 net/ipv4/ip_output.c:228
       __ip_finish_output net/ipv4/ip_output.c:306 [inline]
       __ip_finish_output+0x396/0x650 net/ipv4/ip_output.c:288
       ip_finish_output+0x32/0x200 net/ipv4/ip_output.c:316
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip_output+0x196/0x310 net/ipv4/ip_output.c:430
       dst_output include/net/dst.h:451 [inline]
       ip_local_out+0xaf/0x1a0 net/ipv4/ip_output.c:126
       iptunnel_xmit+0x628/0xa50 net/ipv4/ip_tunnel_core.c:82
       geneve_xmit_skb drivers/net/geneve.c:966 [inline]
       geneve_xmit+0x10c8/0x3530 drivers/net/geneve.c:1077
       __netdev_start_xmit include/linux/netdevice.h:4683 [inline]
       netdev_start_xmit include/linux/netdevice.h:4697 [inline]
       xmit_one net/core/dev.c:3473 [inline]
       dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3489
       __dev_queue_xmit+0x2985/0x3660 net/core/dev.c:4116
       neigh_hh_output include/net/neighbour.h:533 [inline]
       neigh_output include/net/neighbour.h:547 [inline]
       ip6_finish_output2+0xf7a/0x14f0 net/ipv6/ip6_output.c:126
       __ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
       __ip6_finish_output+0x61e/0xe90 net/ipv6/ip6_output.c:170
       ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224
       dst_output include/net/dst.h:451 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       NF_HOOK include/linux/netfilter.h:301 [inline]
       mld_sendpack+0x9a3/0xe40 net/ipv6/mcast.c:1826
       mld_send_cr net/ipv6/mcast.c:2127 [inline]
       mld_ifc_work+0x71c/0xdc0 net/ipv6/mcast.c:2659
       process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307
       worker_thread+0x657/0x1110 kernel/workqueue.c:2454
       kthread+0x2e9/0x3a0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
       </TASK>
      ----------------
      Code disassembly (best guess):
         0:   48 89 eb                mov    %rbp,%rbx
         3:   c6 45 01 01             movb   $0x1,0x1(%rbp)
         7:   41 bc 00 80 00 00       mov    $0x8000,%r12d
         d:   48 c1 e9 03             shr    $0x3,%rcx
        11:   83 e3 07                and    $0x7,%ebx
        14:   41 be 01 00 00 00       mov    $0x1,%r14d
        1a:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
        21:   fc ff df
        24:   4c 8d 2c 01             lea    (%rcx,%rax,1),%r13
        28:   eb 0c                   jmp    0x36
      * 2a:   f3 90                   pause <-- trapping instruction
        2c:   41 83 ec 01             sub    $0x1,%r12d
        30:   0f 84 72 04 00 00       je     0x4a8
        36:   41 0f b6 45 00          movzbl 0x0(%r13),%eax
        3b:   38 d8                   cmp    %bl,%al
        3d:   7f 08                   jg     0x47
        3f:   84                      .byte 0x84
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20220215235305.3272331-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • B
      mtd: rawnand: qcom: Fix clock sequencing in qcom_nandc_probe() · 6ff4cceb
      Committed by Bryan O'Donoghue
      stable inclusion
      from linux-4.19.231
      commit 891484112f4ffd0b576e88407bbd3653abf3faba
      
      --------------------------------
      
      commit 5c23b3f9 upstream.
      
      Interacting with a NAND chip on an IPQ6018 I found that the qcomsmem NAND
      partition parser was returning -EPROBE_DEFER waiting for the main smem
      driver to load.
      
      This caused the board to reset. Playing about with the probe() function
      shows that the problem lies in the core clock being switched off before the
      nandc_unalloc() routine has completed.
      
      If we look at how qcom_nandc_remove() tears down allocated resources we see
      the expected order is
      
      qcom_nandc_unalloc(nandc);
      
      clk_disable_unprepare(nandc->aon_clk);
      clk_disable_unprepare(nandc->core_clk);
      
      dma_unmap_resource(&pdev->dev, nandc->base_dma, resource_size(res),
      		   DMA_BIDIRECTIONAL, 0);
      
      Tweaking probe() to both bring up and tear-down in that order removes the
      reset if we end up deferring elsewhere.
      
      Fixes: c76b78d8 ("mtd: nand: Qualcomm NAND controller driver")
      Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
      Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20220103030316.58301-2-bryan.odonoghue@linaro.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • T
      NFS: Do not report writeback errors in nfs_getattr() · e488b678
      Trond Myklebust authored
      stable inclusion
      from linux-4.19.231
      commit d2ba21f271eb167ac2b0581598e55222b89f0f32
      
      --------------------------------
      
      commit d19e0183 upstream.
      
      The result of the writeback, whether it is an ENOSPC or an EIO, or
      anything else, does not inhibit the NFS client from reporting the
      correct file timestamps.
      
      Fixes: 79566ef0 ("NFS: Getattr doesn't require data sync semantics")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      e488b678
    • T
      NFS: LOOKUP_DIRECTORY is also ok with symlinks · 38677870
      Trond Myklebust authored
      stable inclusion
      from linux-4.19.231
      commit 67552482aee7a139ed957595db8668d28ddc2c42
      
      --------------------------------
      
      commit e0caaf75 upstream.
      
      Commit ac795161 (NFSv4: Handle case where the lookup of a directory
      fails) [1], part of Linux since 5.17-rc2, introduced a regression, where
      a symbolic link on an NFS mount to a directory on another NFS does not
      resolve(?) the first time it is accessed:
      
      Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Fixes: ac795161 ("NFSv4: Handle case where the lookup of a directory fails")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Tested-by: Donald Buczek <buczek@molgen.mpg.de>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      38677870
    • E
      bonding: fix data-races around agg_select_timer · 68f74ceb
      Eric Dumazet authored
      stable inclusion
      from linux-4.19.231
      commit 4218e6995c19970aa7b32914be6c8e059837cbdf
      
      --------------------------------
      
      commit 9ceaf6f7 upstream.
      
      syzbot reported that two threads might write over agg_select_timer
      at the same time. Make agg_select_timer atomic to fix the races.
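As a rough user-space analogue of making the timer atomic, the sketch below uses C11 atomics for a countdown that one thread arms and another decrements. The function names and the decrement-if-positive loop are assumptions made for illustration; the kernel patch uses its own atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared countdown field. */
static atomic_int agg_select_timer_sketch;

/* Arm (or re-arm) the countdown; safe against a concurrent tick. */
static void timer_arm_sketch(int ticks)
{
	assert(ticks >= 0);
	atomic_store(&agg_select_timer_sketch, ticks);
}

/*
 * Decrement the countdown if it is still positive.
 * Returns true while the timer is still running after this tick.
 */
static bool timer_tick_sketch(void)
{
	int old = atomic_load(&agg_select_timer_sketch);

	while (old > 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&agg_select_timer_sketch,
						 &old, old - 1))
			return old - 1 > 0;
	}
	return false;
}
```

Because every access goes through an atomic operation, a concurrent arm and tick can no longer overwrite each other's update with a stale value.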
      
      BUG: KCSAN: data-race in bond_3ad_initiate_agg_selection / bond_3ad_state_machine_handler
      
      read to 0xffff8881242aea90 of 4 bytes by task 1846 on cpu 1:
       bond_3ad_state_machine_handler+0x99/0x2810 drivers/net/bonding/bond_3ad.c:2317
       process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
       worker_thread+0x616/0xa70 kernel/workqueue.c:2454
       kthread+0x1bf/0x1e0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30
      
      write to 0xffff8881242aea90 of 4 bytes by task 25910 on cpu 0:
       bond_3ad_initiate_agg_selection+0x18/0x30 drivers/net/bonding/bond_3ad.c:1998
       bond_open+0x658/0x6f0 drivers/net/bonding/bond_main.c:3967
       __dev_open+0x274/0x3a0 net/core/dev.c:1407
       dev_open+0x54/0x190 net/core/dev.c:1443
       bond_enslave+0xcef/0x3000 drivers/net/bonding/bond_main.c:1937
       do_set_master net/core/rtnetlink.c:2532 [inline]
       do_setlink+0x94f/0x2500 net/core/rtnetlink.c:2736
       __rtnl_newlink net/core/rtnetlink.c:3414 [inline]
       rtnl_newlink+0xfeb/0x13e0 net/core/rtnetlink.c:3529
       rtnetlink_rcv_msg+0x745/0x7e0 net/core/rtnetlink.c:5594
       netlink_rcv_skb+0x14e/0x250 net/netlink/af_netlink.c:2494
       rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:5612
       netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
       netlink_unicast+0x602/0x6d0 net/netlink/af_netlink.c:1343
       netlink_sendmsg+0x728/0x850 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg net/socket.c:725 [inline]
       ____sys_sendmsg+0x39a/0x510 net/socket.c:2413
       ___sys_sendmsg net/socket.c:2467 [inline]
       __sys_sendmsg+0x195/0x230 net/socket.c:2496
       __do_sys_sendmsg net/socket.c:2505 [inline]
       __se_sys_sendmsg net/socket.c:2503 [inline]
       __x64_sys_sendmsg+0x42/0x50 net/socket.c:2503
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0x00000050 -> 0x0000004f
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 25910 Comm: syz-executor.1 Tainted: G        W         5.17.0-rc4-syzkaller-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      68f74ceb
    • E
      drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hit · 452fc6b6
      Eric Dumazet authored
      stable inclusion
      from linux-4.19.231
      commit e294bc65746b779e4ad50f95ed04926cf72d1454
      
      --------------------------------
      
      commit dcd54265 upstream.
      
      trace_napi_poll_hit() reads stat->dev while another thread can write
      to it from dropmon_net_event().
      
      Use READ_ONCE()/WRITE_ONCE() here, RCU rules are properly enforced already,
      we only have to take care of load/store tearing.
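The idea of a single untorn load or store can be approximated in user space with volatile accesses. The macros below are simplified stand-ins written for this sketch (they rely on the GCC/Clang `__typeof__` extension) and are not the kernel's definitions; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros: one volatile access each. */
#define READ_ONCE_SKETCH(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct dev_sketch {
	int ifindex;
};

struct stat_sketch {
	struct dev_sketch *dev;	/* may be cleared by another context */
};

/* Reader: load the pointer exactly once, then test and use the local copy. */
static int reader_ifindex_sketch(struct stat_sketch *stat)
{
	struct dev_sketch *dev;

	assert(stat != NULL);
	dev = READ_ONCE_SKETCH(stat->dev);
	return dev ? dev->ifindex : -1;
}

/* Writer: publish the cleared pointer with a single store. */
static void writer_clear_sketch(struct stat_sketch *stat)
{
	WRITE_ONCE_SKETCH(stat->dev, NULL);
}
```

Testing and dereferencing the local copy, rather than re-reading `stat->dev`, is what keeps the reader from seeing two different values.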
      
      BUG: KCSAN: data-race in dropmon_net_event / trace_napi_poll_hit
      
      write to 0xffff88816f3ab9c0 of 8 bytes by task 20260 on cpu 1:
       dropmon_net_event+0xb8/0x2b0 net/core/drop_monitor.c:1579
       notifier_call_chain kernel/notifier.c:84 [inline]
       raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:392
       call_netdevice_notifiers_info net/core/dev.c:1919 [inline]
       call_netdevice_notifiers_extack net/core/dev.c:1931 [inline]
       call_netdevice_notifiers net/core/dev.c:1945 [inline]
       unregister_netdevice_many+0x867/0xfb0 net/core/dev.c:10415
       ip_tunnel_delete_nets+0x24a/0x280 net/ipv4/ip_tunnel.c:1123
       vti_exit_batch_net+0x2a/0x30 net/ipv4/ip_vti.c:515
       ops_exit_list net/core/net_namespace.c:173 [inline]
       cleanup_net+0x4dc/0x8d0 net/core/net_namespace.c:597
       process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
       worker_thread+0x616/0xa70 kernel/workqueue.c:2454
       kthread+0x1bf/0x1e0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30
      
      read to 0xffff88816f3ab9c0 of 8 bytes by interrupt on cpu 0:
       trace_napi_poll_hit+0x89/0x1c0 net/core/drop_monitor.c:292
       trace_napi_poll include/trace/events/napi.h:14 [inline]
       __napi_poll+0x36b/0x3f0 net/core/dev.c:6366
       napi_poll net/core/dev.c:6432 [inline]
       net_rx_action+0x29e/0x650 net/core/dev.c:6519
       __do_softirq+0x158/0x2de kernel/softirq.c:558
       do_softirq+0xb1/0xf0 kernel/softirq.c:459
       __local_bh_enable_ip+0x68/0x70 kernel/softirq.c:383
       __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline]
       _raw_spin_unlock_bh+0x33/0x40 kernel/locking/spinlock.c:210
       spin_unlock_bh include/linux/spinlock.h:394 [inline]
       ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline]
       wg_packet_decrypt_worker+0x73c/0x780 drivers/net/wireguard/receive.c:506
       process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
       worker_thread+0x616/0xa70 kernel/workqueue.c:2454
       kthread+0x1bf/0x1e0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30
      
      value changed: 0xffff88815883e000 -> 0x0000000000000000
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 26435 Comm: kworker/0:1 Not tainted 5.17.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker
      
      Fixes: 4ea7e386 ("dropmon: add ability to detect when hardware dropsrxpackets")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      452fc6b6
    • X
      ping: fix the dif and sdif check in ping_lookup · 8f077184
      Xin Long authored
      stable inclusion
      from linux-4.19.231
      commit b0e55a57df7deed5f0682b584e4da913db2f6e84
      
      --------------------------------
      
      commit 35a79e64 upstream.
      
      When 'ping' is switched to use a PING socket instead of a RAW socket by:
      
         # sysctl -w net.ipv4.ping_group_range="0 100"
      
      another regression appears when matching sk_bound_dev_if against dif:
      the RAW socket uses inet_iif() while the PING socket lookup uses
      skb->dev->ifindex, so the commands below fail:
      
        # ip link add dummy0 type dummy
        # ip link set dummy0 up
        # ip addr add 192.168.111.1/24 dev dummy0
        # ping -I dummy0 192.168.111.1 -c1
      
      The issue was also reported on:
      
        https://github.com/iputils/iputils/issues/104
      
      It was fixed in iputils, but in the wrong way: by not binding to the
      device when the destination IP is on that device, which causes some
      kselftests to fail, as Jianlin noticed.
      
      This patch uses inet(6)_iif and inet(6)_sdif to get dif and sdif for
      the PING socket, consistent with the RAW socket.
      
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      8f077184
    • E
      taskstats: Cleanup the use of task->exit_code · b67cb551
      Eric W. Biederman authored
      stable inclusion
      from linux-4.19.231
      commit c6055df602b9e9ed9427b7bb9088c75cb5cf6350
      
      --------------------------------
      
      commit 1b5a42d9 upstream.
      
      In the function bacct_add_task the code reading task->exit_code was
      introduced in commit f3cef7a9 ("[PATCH] csa: basic accounting over
      taskstats"), and it is not entirely clear what the taskstats interface
      is trying to return as only returning the exit_code of the first task
      in a process doesn't make a lot of sense.
      
      As best as I can figure the intent is to return task->exit_code after
      a task exits.  The field is returned with per task fields, so the
      exit_code of the entire process is not wanted.  Only the value of the
      first task is returned so this is not a useful way to get the per task
      ptrace stop code.  The ordinary case of returning this value is
      returning after a task exits, which also precludes use for getting
      a ptrace value.
      
      It is common for the first task of a process to also be the last
      task of a process so this field may have done something reasonable by
      accident in testing.
      
      Make ac_exitcode a reliable per task value by always returning it for
      every exited task.
      
      Setting ac_exitcode in a sensible manner makes it possible to continue
      to provide this value going forward.
      
      Cc: Balbir Singh <bsingharora@gmail.com>
      Fixes: f3cef7a9 ("[PATCH] csa: basic accounting over taskstats")
      Link: https://lkml.kernel.org/r/20220103213312.9144-5-ebiederm@xmission.com
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      [sudip: adjust context]
      Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      b67cb551
    • G
      xfrm: Don't accidentally set RTO_ONLINK in decode_session4() · 9f4c7283
      Guillaume Nault authored
      stable inclusion
      from linux-4.19.231
      commit bbf7a1a2fc64d89896ba5eba494a40ca151a2675
      
      --------------------------------
      
      commit 23e7b1bf upstream.
      
      Similar to commit 94e22389 ("xfrm4: strip ECN bits from tos field"),
      clear the ECN bits from iph->tos when setting ->flowi4_tos.
      This ensures that the last bit of ->flowi4_tos is cleared, so
      ip_route_output_key_hash() isn't going to restrict the scope of the
      route lookup.
      
      Use ~INET_ECN_MASK instead of IPTOS_RT_MASK, because we have no reason
      to clear the high order bits.
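The difference between the two masks is easy to check with a few bit operations. The sketch below copies the mask values into locally named constants (the `_SKETCH` names are hypothetical; the values mirror INET_ECN_MASK, IPTOS_RT_MASK and RTO_ONLINK) and is not the patched kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define INET_ECN_MASK_SKETCH 0x03	/* two low-order ECN bits */
#define IPTOS_RT_MASK_SKETCH 0x1C	/* legacy TOS bits, ECN and high bits cleared */
#define RTO_ONLINK_SKETCH    0x01	/* shares bit 0 with the ECN field */

/* Clear only the ECN bits, keeping the high-order (DSCP) bits. */
static uint8_t tos_strip_ecn_sketch(uint8_t tos)
{
	return tos & (uint8_t)~INET_ECN_MASK_SKETCH;
}

/* The alternative mask also discards the three high-order bits. */
static uint8_t tos_rt_mask_sketch(uint8_t tos)
{
	return tos & IPTOS_RT_MASK_SKETCH;
}
```

For a TOS byte of 0xB9 (DSCP EF with ECN set to 01), stripping ECN gives 0xB8 with bit 0 clear, while the narrower mask would reduce it to 0x18.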
      
      Found by code inspection, compile tested only.
      
      Fixes: 4da3089f ("[IPSEC]: Use TOS when doing tunnel lookups")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      [sudip: manually backport to previous location]
      Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      9f4c7283
    • S
      nvme: fix a possible use-after-free in controller reset during load · 0173888a
      Sagi Grimberg authored
      stable inclusion
      from linux-4.19.231
      commit a25e460fbb0340488d119fb2e28fe3f829b7417e
      
      --------------------------------
      
      [ Upstream commit 0fa0f99f ]
      
      Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl
      readiness for AER submission. This may lead to a use-after-free
      condition that was observed with nvme-tcp.
      
      The race condition may happen in the following scenario:
      1. driver executes its reset_ctrl_work
      2. -> nvme_stop_ctrl - flushes ctrl async_event_work
      3. ctrl sends AEN which is received by the host, which in turn
         schedules AEN handling
      4. teardown admin queue (which releases the queue socket)
      5. AEN processed, submits another AER, calling the driver to submit
      6. driver attempts to send the cmd
      ==> use-after-free
      
      In order to fix that, add ctrl state check to validate the ctrl
      is actually able to accept the AER submission.
      
      This addresses the above race in controller resets because the driver
      during teardown should:
      1. change ctrl state to RESETTING
      2. flush async_event_work (as well as other async work elements)
      
      So after 1,2, any other AER command will find the
      ctrl state to be RESETTING and bail out without submitting the AER.
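
The state check described above can be modelled in a few lines of user-space C. All names here are hypothetical, the flush step is only indicated by a comment, and the model is single-threaded, so it shows the ordering argument rather than the actual driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ctrl_state_sketch { CTRL_LIVE_SKETCH, CTRL_RESETTING_SKETCH };

struct ctrl_sketch {
	enum ctrl_state_sketch state;
	bool queue_torn_down;	/* true once the admin queue is released */
	int submitted;		/* number of AERs handed to the driver */
};

/* AEN handling work: bail out unless the controller is live. */
static bool async_event_work_sketch(struct ctrl_sketch *ctrl)
{
	assert(ctrl != NULL);
	if (ctrl->state != CTRL_LIVE_SKETCH)
		return false;	/* no AER is submitted during reset */

	/* Reaching this point with a released queue would be the use-after-free. */
	assert(!ctrl->queue_torn_down);
	ctrl->submitted++;
	return true;
}

/* Reset path: change state first, then flush, then release the queue. */
static void reset_teardown_sketch(struct ctrl_sketch *ctrl)
{
	ctrl->state = CTRL_RESETTING_SKETCH;
	/* flushing of pending async event work would happen here */
	ctrl->queue_torn_down = true;
}
```

Any work item that runs after `reset_teardown_sketch()` sees the RESETTING state and returns before touching the released queue.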
      
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      0173888a
    • D
      quota: make dquot_quota_sync return errors from ->sync_fs · 93ed5e1c
      Darrick J. Wong authored
      stable inclusion
      from linux-4.19.231
      commit a1a41571f06e2b66229738bd0c92c1d57dd793e2
      
      --------------------------------
      
      [ Upstream commit dd5532a4 ]
      
      Strangely, dquot_quota_sync ignores the return code from the ->sync_fs
      call, which means that quota calls like Q_SYNC never see the error.  This
      doesn't seem right, so fix that.
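
A minimal sketch of the pattern, assuming a hypothetical superblock type with an optional `sync_fs` callback: the caller checks the callback's return value and hands it back instead of dropping it. The names are illustrative and the rest of the quota sync work is elided.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a superblock with an optional sync_fs hook. */
struct sb_sketch {
	int (*sync_fs)(struct sb_sketch *sb, int wait);
};

static int quota_sync_sketch(struct sb_sketch *sb)
{
	int ret;

	assert(sb != NULL);
	if (sb->sync_fs) {
		ret = sb->sync_fs(sb, 1);
		if (ret)
			return ret;	/* propagate instead of ignoring */
	}
	/* remaining quota sync steps elided in this sketch */
	return 0;
}

/* Test double: a filesystem whose sync always fails with -EIO. */
static int failing_sync_fs_sketch(struct sb_sketch *sb, int wait)
{
	(void)sb;
	(void)wait;
	return -EIO;
}
```

With the check in place, a caller of `quota_sync_sketch()` sees `-EIO` when the filesystem's sync fails.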
      
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Christian Brauner <brauner@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      93ed5e1c
    • D
      vfs: make freeze_super abort when sync_filesystem returns error · 976db2f4
      Darrick J. Wong authored
      stable inclusion
      from linux-4.19.231
      commit d5c33270b8b2c274eb8df7b92d166166102528c6
      
      --------------------------------
      
      [ Upstream commit 2719c716 ]
      
      If we fail to synchronize the filesystem while preparing to freeze the
      fs, abort the freeze.
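
Aborting a freeze means undoing the state change made before the sync was attempted. The sketch below models that rollback with a hypothetical three-state superblock; the `sync_result` field stands in for the outcome of the real sync call and none of the names come from the kernel source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum freeze_state_sketch {
	SB_UNFROZEN_SKETCH,
	SB_FREEZE_WRITE_SKETCH,
	SB_FREEZE_COMPLETE_SKETCH,
};

struct super_sketch {
	enum freeze_state_sketch state;
	int sync_result;	/* stands in for the sync call's return value */
};

static int freeze_super_sketch(struct super_sketch *sb)
{
	int ret;

	assert(sb != NULL);
	if (sb->state != SB_UNFROZEN_SKETCH)
		return -EBUSY;

	sb->state = SB_FREEZE_WRITE_SKETCH;	/* block new writers */

	ret = sb->sync_result;			/* "sync the filesystem" */
	if (ret) {
		/* Abort: roll the state back so writers can proceed again. */
		sb->state = SB_UNFROZEN_SKETCH;
		return ret;
	}

	sb->state = SB_FREEZE_COMPLETE_SKETCH;
	return 0;
}
```

A failed sync therefore leaves the superblock unfrozen and reports the error, rather than completing a freeze on top of unsynced data.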
      
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Christian Brauner <brauner@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      976db2f4
    • R
      serial: parisc: GSC: fix build when IOSAPIC is not set · fa03a2a6
      Randy Dunlap authored
      stable inclusion
      from linux-4.19.231
      commit 592b1b5ad3ee6cce1381111cfd69c6dc868050a3
      
      --------------------------------
      
      commit 6e879367 upstream.
      
      There is a build error when using a kernel .config file from
      'kernel test robot' for a different build problem:
      
      hppa64-linux-ld: drivers/tty/serial/8250/8250_gsc.o: in function `.LC3':
      (.data.rel.ro+0x18): undefined reference to `iosapic_serial_irq'
      
      when:
        CONFIG_GSC=y
        CONFIG_SERIO_GSCPS2=y
        CONFIG_SERIAL_8250_GSC=y
        CONFIG_PCI is not set
          and hence PCI_LBA is not set.
        IOSAPIC depends on PCI_LBA, so IOSAPIC is not set/enabled.
      
      Make the use of iosapic_serial_irq() conditional to fix the build error.
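
The general technique is to reference an optional helper only when the option that provides it is enabled. In the sketch below, the macro `IOSAPIC_SKETCH` plays the role of the config option and both function names are hypothetical; how the real driver guards the call is not shown in this log.

```c
#include <assert.h>

/* Build with -DIOSAPIC_SKETCH to model the option being enabled. */
#ifdef IOSAPIC_SKETCH
static int iosapic_serial_irq_sketch(int dev_id)
{
	return 100 + dev_id;	/* made-up IRQ numbering */
}
#endif

static int serial_irq_sketch(int dev_id)
{
	assert(dev_id >= 0);
#ifdef IOSAPIC_SKETCH
	return iosapic_serial_irq_sketch(dev_id);
#else
	/* Helper not built: nothing to link against, report "no IRQ". */
	return -1;
#endif
}
```

Without the macro, the translation unit contains no reference to the helper, so the link step cannot fail on it.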
      
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: kernel test robot <lkp@intel.com>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: linux-parisc@vger.kernel.org
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: linux-serial@vger.kernel.org
      Cc: Jiri Slaby <jirislaby@kernel.org>
      Cc: Johan Hovold <johan@kernel.org>
      Suggested-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      fa03a2a6
    • S
      perf: Fix list corruption in perf_cgroup_switch() · 90dd4e1f
      Song Liu authored
      stable inclusion
      from linux-4.19.230
      commit 30d9f3cbe47e1018ddc8069ac5b5c9e66fbdf727
      
      --------------------------------
      
      commit 5f4e5ce6 upstream.
      
      There's list corruption on cgrp_cpuctx_list. This happens on the
      following path:
      
        perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
            cpu_ctx_sched_in
               ctx_sched_in
                  ctx_pinned_sched_in
                    merge_sched_in
                        perf_cgroup_event_disable: remove the event from the list
      
      Use list_for_each_entry_safe() to allow removing an entry during
      iteration.
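What the "safe" iterator adds is a saved pointer to the next entry, taken before the loop body can unlink the current one. The sketch below shows the same idea on a hand-rolled singly linked list; it does not use the kernel's list macros, and the node type and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct node_sketch {
	int id;
	struct node_sketch *next;
};

/*
 * Walk the list and unlink every node with an even id.
 * Returns the number of nodes visited.
 */
static int unlink_even_sketch(struct node_sketch **head)
{
	struct node_sketch **link = head;
	struct node_sketch *cur, *next;
	int visited = 0;

	assert(head != NULL);
	for (cur = *head; cur; cur = next) {
		next = cur->next;	/* saved before cur can be unlinked */
		visited++;
		if (cur->id % 2 == 0) {
			*link = next;
			/* After this, reading cur->next would end the walk early. */
			cur->next = NULL;
		} else {
			link = &cur->next;
		}
	}
	return visited;
}
```

An unsafe loop that advanced with `cur = cur->next` after the unlink would stop at the first removed node and never reach the rest of the list.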
      
      Fixes: 058fe1c0 ("perf/core: Make cgroup switch visit only cpuctxs with cgroup events")
      Signed-off-by: Song Liu <song@kernel.org>
      Reviewed-by: Rik van Riel <riel@surriel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20220204004057.2961252-1-song@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      90dd4e1f
    • K
      seccomp: Invalidate seccomp mode to catch death failures · e5d58f5e
      Kees Cook authored
      stable inclusion
      from linux-4.19.230
      commit 255264d81da6edaf4cd4fab836d1ef3ba09af6aa
      
      --------------------------------
      
      commit 495ac306 upstream.
      
      If seccomp tries to kill a process, it should never see that process
      again. To enforce this proactively, switch the mode to something
      impossible. If encountered: WARN, reject all syscalls, and attempt to
      kill the process again even harder.
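The "impossible mode" idea can be sketched as a sentinel state that is entered on the first kill decision and that rejects everything afterwards. The enum, struct, and counters below are hypothetical; the kill itself is represented only by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mode_sketch {
	MODE_DISABLED_SKETCH = 0,
	MODE_FILTER_SKETCH   = 2,
	MODE_DEAD_SKETCH     = 3,	/* sentinel: should never be observed */
};

struct task_sketch {
	int mode;
	int kill_attempts;
	int warnings;
};

/* Returns true if the syscall may proceed. */
static bool check_syscall_sketch(struct task_sketch *t, bool filter_says_kill)
{
	assert(t != NULL);
	switch (t->mode) {
	case MODE_DISABLED_SKETCH:
		return true;
	case MODE_FILTER_SKETCH:
		if (!filter_says_kill)
			return true;
		t->mode = MODE_DEAD_SKETCH;	/* invalidate before killing */
		t->kill_attempts++;
		return false;
	case MODE_DEAD_SKETCH:
		/* The task survived a kill: warn, reject, and try again. */
		t->warnings++;
		t->kill_attempts++;
		return false;
	default:
		return false;
	}
}
```

Once the sentinel is set, a task that somehow keeps running cannot make any further syscall through this check.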
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Will Drewry <wad@chromium.org>
      Fixes: 8112c4f1 ("seccomp: remove 2-phase API")
      Cc: stable@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      e5d58f5e
    • T
      n_tty: wake up poll(POLLRDNORM) on receiving data · b3195d23
      TATSUKAWA KOSUKE (立川 江介) authored
      stable inclusion
      from linux-4.19.230
      commit 031ec2d7fabb5c01601490671666c55cf92e1b98
      
      --------------------------------
      
      commit c816b2e6 upstream.
      
      The poll man page says POLLRDNORM is equivalent to POLLIN when used as
      an event.
      $ man poll
      <snip>
                    POLLRDNORM
                           Equivalent to POLLIN.
      
      However, in n_tty driver, POLLRDNORM does not return until timeout even
      if there is terminal input, whereas POLLIN returns.
      
      The following test program works until kernel-3.17, but the test stops
      in poll() after commit 57087d51 ("tty: Fix spurious poll() wakeups").
      
      [Steps to run test program]
        $ cc -o test-pollrdnorm test-pollrdnorm.c
        $ ./test-pollrdnorm
        foo          <-- Type in something from the terminal followed by [RET].
                         The string should be echoed back.
      
        ------------------------< test-pollrdnorm.c >------------------------
        #include <stdio.h>
        #include <errno.h>
        #include <poll.h>
        #include <unistd.h>
      
        int main(void)
        {
      	int		n;
      	unsigned char	buf[8];
      	struct pollfd	fds[1] = {{ 0, POLLRDNORM, 0 }};
      
      	n = poll(fds, 1, -1);
      	if (n < 0)
      		perror("poll");
      	n = read(0, buf, 8);
      	if (n < 0)
      		perror("read");
      	if (n > 0)
      		write(1, buf, n);
      	return 0;
        }
        ------------------------------------------------------------------------
      
      The attached patch fixes this problem.  Many calls to
      wake_up_interruptible_poll() in the kernel source code already specify
      "POLLIN | POLLRDNORM".
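Why the wake-up key matters can be seen from a simplified model of the match between what a waiter asked for and what the waker announces. The function below is an assumption-laden sketch (it ignores error and hang-up bits and uses a hypothetical name), not the kernel's wake function.

```c
#define _XOPEN_SOURCE 700	/* make sure POLLRDNORM is visible */
#include <assert.h>
#include <poll.h>
#include <stdbool.h>

/*
 * Simplified model: a sleeping poll() caller is woken only if the
 * waker's key shares at least one bit with the requested events.
 */
static bool waiter_woken_sketch(short requested, short wake_key)
{
	assert(requested != 0);
	return (requested & wake_key) != 0;
}
```

In this model, a waiter that requested only POLLRDNORM is not woken by a key of POLLIN alone, but is woken once the key is POLLIN | POLLRDNORM.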
      
      Fixes: 57087d51 ("tty: Fix spurious poll() wakeups")
      Cc: stable@vger.kernel.org
      Signed-off-by: Kosuke Tatsukawa <tatsu-ab1@nec.com>
      Link: https://lore.kernel.org/r/TYCPR01MB81901C0F932203D30E452B3EA5209@TYCPR01MB8190.jpnprd01.prod.outlook.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      b3195d23
    • E
      veth: fix races around rq->rx_notify_masked · c060bda8
      Eric Dumazet authored
      stable inclusion
      from linux-4.19.230
      commit 8f0ea3777590c16222d4251a49e31dd376ff5ac1
      
      --------------------------------
      
      [ Upstream commit 68468d8c ]
      
      veth being NETIF_F_LLTX enabled, we need to be more careful
      whenever we read/write rq->rx_notify_masked.
      
      BUG: KCSAN: data-race in veth_xmit / veth_xmit
      
      write to 0xffff888133d9a9f8 of 1 bytes by task 23552 on cpu 0:
       __veth_xdp_flush drivers/net/veth.c:269 [inline]
       veth_xmit+0x307/0x470 drivers/net/veth.c:350
       __netdev_start_xmit include/linux/netdevice.h:4683 [inline]
       netdev_start_xmit include/linux/netdevice.h:4697 [inline]
       xmit_one+0x105/0x2f0 net/core/dev.c:3473
       dev_hard_start_xmit net/core/dev.c:3489 [inline]
       __dev_queue_xmit+0x86d/0xf90 net/core/dev.c:4116
       dev_queue_xmit+0x13/0x20 net/core/dev.c:4149
       br_dev_queue_push_xmit+0x3ce/0x430 net/bridge/br_forward.c:53
       NF_HOOK include/linux/netfilter.h:307 [inline]
       br_forward_finish net/bridge/br_forward.c:66 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       __br_forward+0x2e4/0x400 net/bridge/br_forward.c:115
       br_flood+0x521/0x5c0 net/bridge/br_forward.c:242
       br_dev_xmit+0x8b6/0x960
       __netdev_start_xmit include/linux/netdevice.h:4683 [inline]
       netdev_start_xmit include/linux/netdevice.h:4697 [inline]
       xmit_one+0x105/0x2f0 net/core/dev.c:3473
       dev_hard_start_xmit net/core/dev.c:3489 [inline]
       __dev_queue_xmit+0x86d/0xf90 net/core/dev.c:4116
       dev_queue_xmit+0x13/0x20 net/core/dev.c:4149
       neigh_hh_output include/net/neighbour.h:525 [inline]
       neigh_output include/net/neighbour.h:539 [inline]
       ip_finish_output2+0x6f8/0xb70 net/ipv4/ip_output.c:228
       ip_finish_output+0xfb/0x240 net/ipv4/ip_output.c:316
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:430
       dst_output include/net/dst.h:451 [inline]
       ip_local_out net/ipv4/ip_output.c:126 [inline]
       ip_send_skb+0x6e/0xe0 net/ipv4/ip_output.c:1570
       udp_send_skb+0x641/0x880 net/ipv4/udp.c:967
       udp_sendmsg+0x12ea/0x14c0 net/ipv4/udp.c:1254
       inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg net/socket.c:725 [inline]
       ____sys_sendmsg+0x39a/0x510 net/socket.c:2413
       ___sys_sendmsg net/socket.c:2467 [inline]
       __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
       __do_sys_sendmmsg net/socket.c:2582 [inline]
       __se_sys_sendmmsg net/socket.c:2579 [inline]
       __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff888133d9a9f8 of 1 bytes by task 23563 on cpu 1:
       __veth_xdp_flush drivers/net/veth.c:268 [inline]
       veth_xmit+0x2d6/0x470 drivers/net/veth.c:350
       __netdev_start_xmit include/linux/netdevice.h:4683 [inline]
       netdev_start_xmit include/linux/netdevice.h:4697 [inline]
       xmit_one+0x105/0x2f0 net/core/dev.c:3473
       dev_hard_start_xmit net/core/dev.c:3489 [inline]
       __dev_queue_xmit+0x86d/0xf90 net/core/dev.c:4116
       dev_queue_xmit+0x13/0x20 net/core/dev.c:4149
       br_dev_queue_push_xmit+0x3ce/0x430 net/bridge/br_forward.c:53
       NF_HOOK include/linux/netfilter.h:307 [inline]
       br_forward_finish net/bridge/br_forward.c:66 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       __br_forward+0x2e4/0x400 net/bridge/br_forward.c:115
       br_flood+0x521/0x5c0 net/bridge/br_forward.c:242
       br_dev_xmit+0x8b6/0x960
       __netdev_start_xmit include/linux/netdevice.h:4683 [inline]
       netdev_start_xmit include/linux/netdevice.h:4697 [inline]
       xmit_one+0x105/0x2f0 net/core/dev.c:3473
       dev_hard_start_xmit net/core/dev.c:3489 [inline]
       __dev_queue_xmit+0x86d/0xf90 net/core/dev.c:4116
       dev_queue_xmit+0x13/0x20 net/core/dev.c:4149
       neigh_hh_output include/net/neighbour.h:525 [inline]
       neigh_output include/net/neighbour.h:539 [inline]
       ip_finish_output2+0x6f8/0xb70 net/ipv4/ip_output.c:228
       ip_finish_output+0xfb/0x240 net/ipv4/ip_output.c:316
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:430
       dst_output include/net/dst.h:451 [inline]
       ip_local_out net/ipv4/ip_output.c:126 [inline]
       ip_send_skb+0x6e/0xe0 net/ipv4/ip_output.c:1570
       udp_send_skb+0x641/0x880 net/ipv4/udp.c:967
       udp_sendmsg+0x12ea/0x14c0 net/ipv4/udp.c:1254
       inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg net/socket.c:725 [inline]
       ____sys_sendmsg+0x39a/0x510 net/socket.c:2413
       ___sys_sendmsg net/socket.c:2467 [inline]
       __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
       __do_sys_sendmmsg net/socket.c:2582 [inline]
       __se_sys_sendmmsg net/socket.c:2579 [inline]
       __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0x00 -> 0x01
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 23563 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00064-gc36c04c2 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 948d4f21 ("veth: Add driver XDP")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      c060bda8
    • A
      net: fix a memleak when uncloning an skb dst and its metadata · 0c4296c0
      Antoine Tenart authored
      stable inclusion
      from linux-4.19.230
      commit 0be943916d781df2b652793bb2d3ae4f9624c10a
      
      --------------------------------
      
      [ Upstream commit 9eeabdf1 ]
      
      When uncloning an skb dst and its associated metadata, a new
      dst+metadata is allocated and later replaces the old one in the skb.
      This is helpful to have a non-shared dst+metadata attached to a specific
      skb.
      
      The issue is the uncloned dst+metadata is initialized with a refcount of
      1, which is increased to 2 before attaching it to the skb. When
      tun_dst_unclone returns, the dst+metadata is only referenced from a
      single place (the skb) while its refcount is 2. Its refcount will never
      drop to 0 (when the skb is consumed), leading to a memory leak.
      
      Fix this by removing the call to dst_hold in tun_dst_unclone, as the
      dst+metadata refcount is already 1.
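The leak follows directly from counting references. The sketch below models it with a plain integer refcount; the type and helper names are hypothetical and the "skb consumed" step is reduced to a single release.

```c
#include <assert.h>
#include <stddef.h>

struct dst_sketch {
	int refcnt;
};

/* A freshly allocated object already carries one reference. */
static void dst_init_sketch(struct dst_sketch *d)
{
	assert(d != NULL);
	d->refcnt = 1;
}

static void dst_hold_sketch(struct dst_sketch *d)
{
	d->refcnt++;
}

/* Returns 1 when the last reference is dropped (object would be freed). */
static int dst_release_sketch(struct dst_sketch *d)
{
	return --d->refcnt == 0;
}

/*
 * Model of uncloning: allocate, optionally take the extra hold, attach
 * to one skb, then consume the skb (one release).
 * Returns 1 if the object is leaked, 0 if it is freed.
 */
static int unclone_leaks_sketch(int extra_hold)
{
	struct dst_sketch d;

	dst_init_sketch(&d);
	if (extra_hold)
		dst_hold_sketch(&d);	/* the call the fix removes */
	return !dst_release_sketch(&d);
}
```

With the extra hold, the single release drops the count from 2 to 1 and the object is never freed; without it, the count reaches 0.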
      
      Fixes: fc4099f1 ("openvswitch: Fix egress tunnel info.")
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Reported-by: Vlad Buslov <vladbu@nvidia.com>
      Tested-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      0c4296c0
    • A
      net: do not keep the dst cache when uncloning an skb dst and its metadata · ceaed295
      Antoine Tenart authored
      stable inclusion
      from linux-4.19.230
      commit 040e92ea3d7d6f27c1b71d6502e35c54a0939cb7
      
      --------------------------------
      
      [ Upstream commit cfc56f85 ]
      
      When uncloning an skb dst and its associated metadata a new dst+metadata
      is allocated and the tunnel information from the old metadata is copied
      over there.
      
      The issue is the tunnel metadata has references to cached dst, which are
      copied along the way. When a dst+metadata refcount drops to 0 the
      metadata is freed including the cached dst entries. As they are also
      referenced in the initial dst+metadata, this ends up in use-after-frees
      (UaFs).
      
      In practice the above did not happen because of another issue: the
      dst+metadata was never freed because its refcount never dropped to 0
      (this will be fixed in a subsequent patch).
      
      Fix this by initializing the dst cache after copying the tunnel
      information from the old metadata to also unshare the dst cache.
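
A minimal userspace sketch of the copy-then-reinitialise idea follows. `struct tun_info_model`, `cache_init()`, `unclone_info()` and `info_destroy()` are hypothetical names, and a single `int` stands in for the dst cache.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: "cache" stands in for the per-tunnel dst cache. */
struct tun_info_model {
	int tunnel_id;
	int *cache; /* owned pointer, freed together with the metadata */
};

/* Allocate a fresh, zeroed cache for this metadata. */
static int cache_init(struct tun_info_model *info)
{
	info->cache = calloc(1, sizeof(*info->cache));
	return info->cache ? 0 : -1;
}

/* Copy the tunnel information, then give the copy its own empty cache. */
static int unclone_info(struct tun_info_model *dst,
			const struct tun_info_model *src)
{
	memcpy(dst, src, sizeof(*dst)); /* dst->cache now aliases src->cache */
	return cache_init(dst);         /* re-initialise so nothing is shared */
}

static void info_destroy(struct tun_info_model *info)
{
	free(info->cache);
	info->cache = NULL;
}
```

Because the copy owns a separate cache, destroying it leaves the original's cache valid.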
      
      Fixes: d71785ff ("net: add dst_cache to ovs vxlan lwtunnel")
      Cc: Paolo Abeni <pabeni@redhat.com>
      Reported-by: Vlad Buslov <vladbu@nvidia.com>
      Tested-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • E
      ipmr,ip6mr: acquire RTNL before calling ip[6]mr_free_table() on failure path · a7f72733
      Eric Dumazet committed
      stable inclusion
      from linux-4.19.230
      commit 12b6703e9546902c56b4b9048b893ad49d62bdd4
      
      --------------------------------
      
      [ Upstream commit 5611a006 ]
      
      ip[6]mr_free_table() can only be called under RTNL lock.
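
The locking rule in the subject line can be sketched with a single-threaded model. Every name here (`model_lock()`, `free_table_model()`, `rules_init_model()`) is hypothetical; a boolean flag stands in for RTNL and a counter stands in for the kernel warning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a global lock such as RTNL. */
static bool model_lock_held;
static int unlocked_free_calls; /* times the cleanup ran without the lock */

static void model_lock(void)
{
	model_lock_held = true;
}

static void model_unlock(void)
{
	model_lock_held = false;
}

/* Stand-in for the table cleanup: it must run with the lock held. */
static void free_table_model(void)
{
	if (!model_lock_held)
		unlocked_free_calls++; /* the kernel would warn here */
}

/* Failure path of an init routine: take the lock around the cleanup. */
static int rules_init_model(bool setup_fails)
{
	if (setup_fails) {
		model_lock();
		free_table_model();
		model_unlock();
		return -1;
	}
	return 0;
}
```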
      
      RTNL: assertion failed at net/core/dev.c (10367)
      WARNING: CPU: 1 PID: 5890 at net/core/dev.c:10367 unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
      Modules linked in:
      CPU: 1 PID: 5890 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller-11627-g422ee58dc0ef #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
      Code: 0f 85 9b ee ff ff e8 69 07 4b fa ba 7f 28 00 00 48 c7 c6 00 90 ae 8a 48 c7 c7 40 90 ae 8a c6 05 6d b1 51 06 01 e8 8c 90 d8 01 <0f> 0b e9 70 ee ff ff e8 3e 07 4b fa 4c 89 e7 e8 86 2a 59 fa e9 ee
      RSP: 0018:ffffc900046ff6e0 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: ffff888050f51d00 RSI: ffffffff815fa008 RDI: fffff520008dfece
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff815f3d6e R11: 0000000000000000 R12: 00000000fffffff4
      R13: dffffc0000000000 R14: ffffc900046ff750 R15: ffff88807b7dc000
      FS:  00007f4ab736e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fee0b4f8990 CR3: 000000001e7d2000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       mroute_clean_tables+0x244/0xb40 net/ipv6/ip6mr.c:1509
       ip6mr_free_table net/ipv6/ip6mr.c:389 [inline]
       ip6mr_rules_init net/ipv6/ip6mr.c:246 [inline]
       ip6mr_net_init net/ipv6/ip6mr.c:1306 [inline]
       ip6mr_net_init+0x3f0/0x4e0 net/ipv6/ip6mr.c:1298
       ops_init+0xaf/0x470 net/core/net_namespace.c:140
       setup_net+0x54f/0xbb0 net/core/net_namespace.c:331
       copy_net_ns+0x318/0x760 net/core/net_namespace.c:475
       create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110
       copy_namespaces+0x391/0x450 kernel/nsproxy.c:178
       copy_process+0x2e0c/0x7300 kernel/fork.c:2167
       kernel_clone+0xe7/0xab0 kernel/fork.c:2555
       __do_sys_clone+0xc8/0x110 kernel/fork.c:2672
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f4ab89f9059
      Code: Unable to access opcode bytes at RIP 0x7f4ab89f902f.
      RSP: 002b:00007f4ab736e118 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
      RAX: ffffffffffffffda RBX: 00007f4ab8b0bf60 RCX: 00007f4ab89f9059
      RDX: 0000000020000280 RSI: 0000000020000270 RDI: 0000000040200000
      RBP: 00007f4ab8a5308d R08: 0000000020000300 R09: 0000000020000300
      R10: 00000000200002c0 R11: 0000000000000206 R12: 0000000000000000
      R13: 00007ffc3977cc1f R14: 00007f4ab736e300 R15: 0000000000022000
       </TASK>
      
      Fixes: f243e5a7 ("ipmr,ip6mr: call ip6mr_free_table() on failure path")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Cong Wang <cong.wang@bytedance.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20220208053451.2885398-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • M
      bonding: pair enable_port with slave_arr_updates · ab97b84d
      Mahesh Bandewar committed
      stable inclusion
      from linux-4.19.230
      commit c92b23d934f008aea3ec3239d951f6c4cc0c4422
      
      --------------------------------
      
      [ Upstream commit 23de0d7b ]
      
      When 802.3ad mode enables a participating port, it should update
      the slave-array. I have observed that the member links are participating
      and are part of the active aggregator while the traffic is egressing via
      only one member link (in a case where two links are participating). Via
      kprobes I discovered that slave-arr has only one link added while
      the other participating link wasn't part of the slave-arr.
      
      I couldn't see what caused that situation, but a simple code
      walk-through hinted that enable_port wasn't always associated
      with the slave-array update.
      
      Fixes: ee637714 ("bonding: Simplify the xmit function for modes that use xmit_hash")
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Link: https://lore.kernel.org/r/20220207222901.1795287-1-maheshb@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • D
      bpf: Add kconfig knob for disabling unpriv bpf by default · 25c90cb2
      Daniel Borkmann committed
      stable inclusion
      from linux-4.19.230
      commit 07e7f7cc619d15645e45d04b1c99550c6d292e9c
      
      --------------------------------
      
      commit 08389d88 upstream.
      
      Add a kconfig knob which allows for unprivileged bpf to be disabled by default.
      If set, the knob sets /proc/sys/kernel/unprivileged_bpf_disabled to value of 2.
      
      This still allows a transition of 2 -> {0,1} through an admin. Similarly,
      this also still keeps 1 -> {1} behavior intact, so that once set to permanently
      disabled, it cannot be undone aside from a reboot.
      
      We've also added extra2 with max of 2 for the procfs handler, so that an admin
      still has a chance to toggle between 0 <-> 2.
      
      Either way, as an additional alternative, applications can make use of CAP_BPF
      that we added a while ago.
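
The allowed transitions can be summarised as a small write rule. This is a model of the behaviour described above, not the kernel's procfs handler; `unpriv_bpf_write()` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>

/*
 * Model of the write rule (not the kernel handler itself):
 *   0 = unprivileged bpf enabled
 *   1 = disabled until reboot
 *   2 = disabled, but an admin may still change it
 */
static int unpriv_bpf_write(int *state, int new_value)
{
	if (new_value < 0 || new_value > 2)
		return -EINVAL; /* the handler caps the value at 2 */
	if (*state == 1 && new_value != 1)
		return -EPERM;  /* 1 -> {1}: locked until reboot */
	*state = new_value;
	return 0;
}
```

Starting from 2, an admin can move to 0 or 1, and can toggle between 0 and 2; once the state is 1, every write other than 1 is rejected.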

      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.1620765074.git.daniel@iogearbox.net
      [fllinden@amazon.com: backported to 4.19]
      Signed-off-by: Frank van der Linden <fllinden@amazon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • Z
      scsi: target: iscsi: Make sure the np under each tpg is unique · 33d04298
      ZouMingzhe committed
      stable inclusion
      from linux-4.19.230
      commit ded80123b84253ecc6c6cfd1fbbbd99f51c984ec
      
      --------------------------------
      
      [ Upstream commit a861790a ]
      
      iscsit_tpg_check_network_portal() has nested for_each loops and is supposed
      to return true when a match is found. However, the tpg loop will still
      continue after exiting the tpg_np loop. If this tpg_np is not the last,
      the match value will be changed.
      
      Break the outer loop after finding a match and make sure the np under each
      tpg is unique.
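
The nested-loop problem can be reproduced with plain arrays. This is a simplified model, not the iSCSI target code: `struct portal_group` and both `portal_in_use_*` functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PORTALS_PER_GROUP 2

/* Hypothetical model: each group holds a few portal ids. */
struct portal_group {
	int portals[PORTALS_PER_GROUP];
};

/* Buggy shape: the inner break leaves only the inner loop. */
static bool portal_in_use_buggy(const struct portal_group *groups,
				size_t n, int portal)
{
	bool match = false;

	for (size_t g = 0; g < n; g++) {
		for (size_t p = 0; p < PORTALS_PER_GROUP; p++) {
			match = (groups[g].portals[p] == portal);
			if (match)
				break;
		}
		/* no check here, so a later group can overwrite match */
	}
	return match;
}

/* Fixed shape: stop the outer loop as soon as a match is found. */
static bool portal_in_use_fixed(const struct portal_group *groups,
				size_t n, int portal)
{
	bool match = false;

	for (size_t g = 0; g < n; g++) {
		for (size_t p = 0; p < PORTALS_PER_GROUP; p++) {
			match = (groups[g].portals[p] == portal);
			if (match)
				break;
		}
		if (match)
			break;
	}
	return match;
}
```

With groups `{1, 2}` and `{3, 4}`, looking up portal 1 shows the difference: the buggy version keeps scanning the second group and returns false, while the fixed version returns true.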
      
      Link: https://lore.kernel.org/r/20220111054742.19582-1-mingzhe.zou@easystack.cn
      Signed-off-by: ZouMingzhe <mingzhe.zou@easystack.cn>
      Reviewed-by: Mike Christie <michael.christie@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • O
      NFSv4 expose nfs_parse_server_name function · 4c895d11
      Olga Kornievskaia committed
      stable inclusion
      from linux-4.19.230
      commit 42dc3cf3174ec714732825638947e26923d9ea2a
      
      --------------------------------
      
      [ Upstream commit f5b27cc6 ]
      
      Make nfs_parse_server_name available outside of nfs4namespace.c.

      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • O
      NFSv4 remove zero number of fs_locations entries error check · a3877f4a
      Olga Kornievskaia committed
      stable inclusion
      from linux-4.19.230
      commit 152f7db416c4bfcc8fc01e55cae60f63489580fa
      
      --------------------------------
      
      [ Upstream commit 90e12a31 ]
      
      Remove the check for the zero length fs_locations reply in the
      xdr decoding, and instead check for that in the migration code.

      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • T
      NFSv4.1: Fix uninitialised variable in devicenotify · 4cb6ed69
      Trond Myklebust committed
      stable inclusion
      from linux-4.19.230
      commit e6b0f9177c43ff9ad0f1a6dd3639c383373911b7
      
      --------------------------------
      
      [ Upstream commit b05bf5c6 ]
      
      When decode_devicenotify_args() exits with no entries, we need to
      ensure that the struct cb_devicenotifyargs is initialised to
      { 0, NULL } in order to avoid problems in
      nfs4_callback_devicenotify().
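
A sketch of the initialise-before-decoding pattern follows. `struct notify_args` and `decode_notify_args()` are hypothetical stand-ins for the callback argument struct and its decoder, and the decoding itself is reduced to storing a pointer and a count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the decoded callback arguments. */
struct notify_args {
	unsigned int ndevs;
	int *devs;
};

/* Decode "count" entries; the empty path still leaves defined output. */
static int decode_notify_args(struct notify_args *args, unsigned int count,
			      int *storage)
{
	/* Initialise first so every exit leaves { 0, NULL } or valid data. */
	args->ndevs = 0;
	args->devs = NULL;

	if (count == 0)
		return 0; /* nothing to decode */

	args->devs = storage;
	args->ndevs = count;
	return 0;
}
```

A consumer can then safely loop over `ndevs` entries, even when the caller passed in a struct holding stale values.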
      
      Reported-by: <rtm@csail.mit.edu>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • X
      nfs: nfs4clinet: check the return value of kstrdup() · 008ae6ae
      Xiaoke Wang committed
      stable inclusion
      from linux-4.19.230
      commit 2c9587f72ff4b502c2fd15eb3ccff388eae12d07
      
      --------------------------------
      
      [ Upstream commit fbd2057e ]
      
      kstrdup() returns NULL when memory allocation fails, so it is better
      to check its return value and catch the memory error in time.
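
The check can be sketched in userspace, with `dup_string()` as a stand-in for kstrdup(). `struct client_model` and `client_set_owner()` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical client object holding one duplicated string. */
struct client_model {
	char *owner_id;
};

/* Userspace stand-in for kstrdup(): returns NULL if allocation fails. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Store a private copy of "id", reporting allocation failure. */
static int client_set_owner(struct client_model *clp, const char *id)
{
	char *copy = dup_string(id);

	if (!copy)
		return -ENOMEM; /* report the failure instead of storing NULL */
	free(clp->owner_id);
	clp->owner_id = copy;
	return 0;
}
```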

      Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • O
      NFSv4 only print the label when its queried · 9e9b47e9
      Olga Kornievskaia committed
      stable inclusion
      from linux-4.19.230
      commit 1789f59f1779c99d0756b40036c62a7519f53543
      
      --------------------------------
      
      [ Upstream commit 2c52c837 ]
      
      When the bitmask of the attributes doesn't include the security label,
      don't bother printing it. Since the label might not be null terminated,
      adjust the printing format accordingly.
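
Printing a buffer that may lack a terminating NUL is usually done with a precision-bounded `%.*s`. The sketch below assumes a hypothetical `struct label_model` and `format_label()`; the `fattr` prefix is made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical label: bytes with an explicit length, no trailing NUL. */
struct label_model {
	const char *data;
	unsigned int len;
};

/* Format the label only if requested; "%.*s" reads at most len bytes. */
static int format_label(char *out, size_t outsz,
			const struct label_model *l, int label_requested)
{
	if (!label_requested)
		return snprintf(out, outsz, "fattr");
	return snprintf(out, outsz, "fattr label=%.*s", (int)l->len, l->data);
}
```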

      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • T
      NFS: Fix initialisation of nfs_client cl_flags field · f22022eb
      Trond Myklebust committed
      stable inclusion
      from linux-4.19.230
      commit 57e13bdd9634a48c0bdbf988858144350321c0bf
      
      --------------------------------
      
      commit 468d126d upstream.
      
      For some long forgotten reason, the nfs_client cl_flags field is
      initialised in nfs_get_client() instead of being initialised at
      allocation time. This quirk was harmless until we moved the call to
      nfs_create_rpc_client().
      
      Fixes: dd99e9f9 ("NFSv4: Initialise connection to the server in nfs4_alloc_client()")
      Cc: stable@vger.kernel.org # 4.8.x
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>