1. 09 Mar, 2021 14 commits
  2. 07 Mar, 2021 1 commit
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 9270bbe2
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Fix incorrect enum type definition in nfnetlink_cthelper UAPI,
         from Dmitry V. Levin.
      
      2) Remove extra space in deprecated automatic helper assignment
         notice, from Klemen Košir.
      
      3) Drop early socket demux socket after NAT mangling, from
         Florian Westphal. Add a test to exercise this bug.
      
      4) Fix bogus invalid packet report in the conntrack TCP tracker,
         also from Florian.
      
      5) Fix access to xt[NFPROTO_UNSPEC] list with no mutex
         in target/match_revfn(), from Vasily Averin.
      
      6) Disallow updates on the table ownership flag.
      
      7) Fix double hook unregistration of tables with owner.
      
      8) Remove bogus check on the table owner in __nft_release_tables().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9270bbe2
  3. 06 Mar, 2021 15 commits
    • J
      ethernet: alx: fix order of calls on resume · a4dcfbc4
      Jakub Kicinski committed
      netif_device_attach() will unpause the queues so we can't call
      it before __alx_open(). This went undetected until
      commit b0999223 ("alx: add ability to allocate and free
      alx_napi structures") but now if stack tries to xmit immediately
      on resume before __alx_open() we'll crash on the NAPI being null:
      
       BUG: kernel NULL pointer dereference, address: 0000000000000198
       CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G           OE 5.10.0-3-amd64 #1 Debian 5.10.13-1
       Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77-D3H, BIOS F15 11/14/2013
       RIP: 0010:alx_start_xmit+0x34/0x650 [alx]
       Code: 41 56 41 55 41 54 55 53 48 83 ec 20 0f b7 57 7c 8b 8e b0
      0b 00 00 39 ca 72 06 89 d0 31 d2 f7 f1 89 d2 48 8b 84 df
       RSP: 0018:ffffb09240083d28 EFLAGS: 00010297
       RAX: 0000000000000000 RBX: ffffa04d80ae7800 RCX: 0000000000000004
       RDX: 0000000000000000 RSI: ffffa04d80afa000 RDI: ffffa04e92e92a00
       RBP: 0000000000000042 R08: 0000000000000100 R09: ffffa04ea3146700
       R10: 0000000000000014 R11: 0000000000000000 R12: ffffa04e92e92100
       R13: 0000000000000001 R14: ffffa04e92e92a00 R15: ffffa04e92e92a00
       FS:  0000000000000000(0000) GS:ffffa0508f600000(0000) knlGS:0000000000000000
       i915 0000:00:02.0: vblank wait timed out on crtc 0
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000198 CR3: 000000004460a001 CR4: 00000000001706f0
       Call Trace:
        dev_hard_start_xmit+0xc7/0x1e0
        sch_direct_xmit+0x10f/0x310
      
      Cc: <stable@vger.kernel.org> # 4.9+
      Fixes: bc2bebe8 ("alx: remove WoL support")
      Reported-by: Zbynek Michl <zbynek.michl@gmail.com>
      Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983595
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Tested-by: Zbynek Michl <zbynek.michl@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a4dcfbc4
    • G
      lan743x: trim all 4 bytes of the FCS; not just 2 · 3e21a10f
      George McCollister committed
      Trim all 4 bytes of the received FCS; not just 2 of them. Leaving 2
      bytes of the FCS on the frame breaks DSA tailing tag drivers.
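
As a rough illustration only (this is not the driver's code), the length handed up the stack should exclude the whole 4-byte FCS. A tagger that reads its tag from the last bytes of the frame would otherwise read leftover FCS bytes. `ETH_FCS_LEN` matches the kernel constant of the same name; `frame_len_without_fcs` is a hypothetical helper.

```c
#include <stddef.h>

/* Length of the Ethernet frame check sequence in bytes (the kernel defines
 * the same constant in <linux/if_ether.h>). */
#define ETH_FCS_LEN 4

/* Hypothetical helper: given the number of bytes the hardware wrote for a
 * frame (FCS included), return the length to hand up the stack. */
static size_t frame_len_without_fcs(size_t hw_len)
{
    /* Guard against runt lengths so the subtraction cannot wrap. */
    if (hw_len < ETH_FCS_LEN)
        return 0;
    return hw_len - ETH_FCS_LEN;
}
```

Trimming only 2 bytes would correspond to subtracting 2 here, leaving half of the FCS attached to the payload.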
      
      Fixes: a8db76d4 ("lan743x: boost performance on cpu archs w/o dma cache snooping")
      Signed-off-by: George McCollister <george.mccollister@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3e21a10f
    • M
      gianfar: fix jumbo packets+napi+rx overrun crash · d8861bab
      Michael Braun committed
      When using jumbo packets and overrunning rx queue with napi enabled,
      the following sequence is observed in gfar_add_rx_frag:
      
         | lstatus                              |       | skb                   |
      t  | lstatus,  size, flags                | first | len, data_len, *ptr   |
      ---+--------------------------------------+-------+-----------------------+
      13 | 18002348, 9032, INTERRUPT LAST       | 0     | 9600, 8000,  f554c12e |
      12 | 10000640, 1600, INTERRUPT            | 0     | 8000, 6400,  f554c12e |
      11 | 10000640, 1600, INTERRUPT            | 0     | 6400, 4800,  f554c12e |
      10 | 10000640, 1600, INTERRUPT            | 0     | 4800, 3200,  f554c12e |
      09 | 10000640, 1600, INTERRUPT            | 0     | 3200, 1600,  f554c12e |
      08 | 14000640, 1600, INTERRUPT FIRST      | 0     | 1600, 0,     f554c12e |
      07 | 14000640, 1600, INTERRUPT FIRST      | 1     | 0,    0,     f554c12e |
      06 | 1c000080, 128,  INTERRUPT LAST FIRST | 1     | 0,    0,     abf3bd6e |
      05 | 18002348, 9032, INTERRUPT LAST       | 0     | 8000, 6400,  c5a57780 |
      04 | 10000640, 1600, INTERRUPT            | 0     | 6400, 4800,  c5a57780 |
      03 | 10000640, 1600, INTERRUPT            | 0     | 4800, 3200,  c5a57780 |
      02 | 10000640, 1600, INTERRUPT            | 0     | 3200, 1600,  c5a57780 |
      01 | 10000640, 1600, INTERRUPT            | 0     | 1600, 0,     c5a57780 |
      00 | 14000640, 1600, INTERRUPT FIRST      | 1     | 0,    0,     c5a57780 |
      
      So at t=7 a new packet is started but not finished, probably due to rx
      overrun - but rx overrun is not indicated in the flags. Instead a new
      packet starts at t=8. This causes skb->len to exceed the size reported
      for the LAST fragment at t=13, and thus a negative fragment size is
      added to the skb.
      
      This then crashes:
      
      kernel BUG at include/linux/skbuff.h:2277!
      Oops: Exception in kernel mode, sig: 5 [#1]
      ...
      NIP [c04689f4] skb_pull+0x2c/0x48
      LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844
      Call Trace:
      [ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable)
      [ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4
      [ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c
      [ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0
      [ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc
      [ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70
      [ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc
      [ec4bff08] [c0062718] kthread+0x140/0x144
      [ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c
      
      This patch fixes this by checking for computed LAST fragment size, so a
      negative sized fragment is never added.
      In order to prevent the newer rx frame from getting corrupted, the FIRST
      flag is checked to discard the incomplete older frame.
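
The LAST-fragment check described above can be sketched as follows. This is a simplified model under stated assumptions, not the gianfar code: `last_frag_len` is a hypothetical helper, and it assumes the LAST descriptor reports the total frame length while the skb already holds the earlier fragments.

```c
#include <stdbool.h>

/* Hypothetical model of the size computation for a LAST fragment.
 * total_len: frame length reported in the LAST descriptor's status word.
 * skb_len:   bytes already accumulated in the skb from earlier fragments.
 * On success stores the fragment size in *frag_len and returns true;
 * returns false when the frame must be dropped. */
static bool last_frag_len(unsigned int total_len, unsigned int skb_len,
                          unsigned int *frag_len)
{
    /* A stale, unfinished frame makes skb_len exceed total_len; the
     * subtraction would then be negative. */
    if (skb_len > total_len)
        return false;
    *frag_len = total_len - skb_len;
    return true;
}
```

With the values from the table, t=13 (size 9032, skb len 9600) is rejected, while t=5 (size 9032, skb len 8000) yields a 1032-byte fragment.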
      Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d8861bab
    • D
      sun/niu: fix wrong RXMAC_BC_FRM_CNT_COUNT count · 155b23e6
      Denis Efremov committed
      RXMAC_BC_FRM_CNT_COUNT added to mp->rx_bcasts twice in a row
      in niu_xmac_interrupt(). Remove the second addition.
      Signed-off-by: Denis Efremov <efremov@linux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      155b23e6
    • D
      net/hamradio/6pack: remove redundant check in sp_encaps() · 85554bcd
      Denis Efremov committed
      "len > sp->mtu" checked twice in a row in sp_encaps().
      Remove the second check.
      Signed-off-by: Denis Efremov <efremov@linux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      85554bcd
    • H
      r8169: fix r8168fp_adjust_ocp_cmd function · abbf9a0e
      Hayes Wang committed
      The (0xBAF70000 & 0x00FFF000) << 6 should be (0xf70 << 18).
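
The arithmetic behind that statement can be checked directly. The two functions below are hypothetical names used only for this worked example; both expressions come from the sentence above.

```c
#include <stdint.h>

/* The masked-and-shifted expression quoted in the commit message. */
static uint32_t oob_base_from_mask(void)
{
    return (UINT32_C(0xBAF70000) & UINT32_C(0x00FFF000)) << 6;
}

/* The equivalent constant form the commit message names. */
static uint32_t oob_base_constant(void)
{
    return UINT32_C(0xf70) << 18;
}
```

The mask keeps 0x00F70000, which is 0xF7 << 16; shifting left by 6 gives 0xF7 << 22. Likewise 0xf70 is 0xF7 << 4, so 0xf70 << 18 is also 0xF7 << 22, that is 0x3DC00000.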
      
      Fixes: 561535b0 ("r8169: fix OCP access on RTL8117")
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Acked-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      abbf9a0e
    • X
      selftest/net/ipsec.c: Remove unneeded semicolon · 0a7e0c3b
      Xu Wang committed
      fix semicolon.cocci warning:
      tools/testing/selftests/net/ipsec.c:1788:2-3: Unneeded semicolon
      Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a7e0c3b
    • J
      ibmvnic: remove excessive irqsave · 69cdb794
      Junlin Yang committed
      ibmvnic_remove locks multiple spinlocks while disabling interrupts:
      spin_lock_irqsave(&adapter->state_lock, flags);
      spin_lock_irqsave(&adapter->rwi_lock, flags);
      
      As reported by coccinelle, the second _irqsave() overwrites the value
      saved in 'flags' by the first _irqsave(). Therefore, when the second
      _irqrestore() runs, the value in 'flags' is not valid: the value saved
      by the first _irqsave() has been lost.
      This likely leads to IRQs remaining disabled. So remove the second
      _irqsave():
      spin_lock_irqsave(&adapter->state_lock, flags);
      spin_lock(&adapter->rwi_lock);
      
      Generated by: ./scripts/coccinelle/locks/flags.cocci
      ./drivers/net/ethernet/ibm/ibmvnic.c:5413:1-18:
      ERROR: nested lock+irqsave that reuses flags from line 5404.
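
The effect of reusing 'flags' can be reproduced with a small userspace mock. Everything here is an assumption made for illustration: `mock_irqsave` and `mock_irqrestore` only model the saved interrupt state (the real kernel primitives are macros that take 'flags' by name), and no actual locking is performed.

```c
#include <stdbool.h>

/* Mock of the CPU's interrupt-enable state; true means IRQs are enabled. */
static bool irqs_enabled = true;

/* Mock of spin_lock_irqsave(): remember the current IRQ state in *flags,
 * then disable interrupts. Locking itself is not modelled. */
static void mock_irqsave(bool *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
}

/* Mock of spin_unlock_irqrestore(): put back the state held in flags. */
static void mock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}

/* Buggy pattern: both acquisitions share one 'flags' variable. */
static bool nested_irqsave_leaves_irqs_enabled(void)
{
    bool flags;

    irqs_enabled = true;
    mock_irqsave(&flags);   /* flags = enabled */
    mock_irqsave(&flags);   /* flags = disabled, first value lost */
    mock_irqrestore(flags);
    mock_irqrestore(flags);
    return irqs_enabled;
}

/* Fixed pattern: the inner lock is taken without touching 'flags'. */
static bool single_irqsave_leaves_irqs_enabled(void)
{
    bool flags;

    irqs_enabled = true;
    mock_irqsave(&flags);   /* outer lock saves the real state */
    /* the inner spin_lock()/spin_unlock() pair would go here */
    mock_irqrestore(flags);
    return irqs_enabled;
}
```

In this model the nested variant ends with interrupts still disabled, while the single-save variant restores the original state.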
      
      Fixes: 4a41c421 ("ibmvnic: serialize access to work queue on remove")
      Signed-off-by: Junlin Yang <yangjunlin@yulong.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      69cdb794
    • S
      CIPSO: Fix unaligned memory access in cipso_v4_gentag_hdr · e233febd
      Sergey Nazarov committed
      We need to use put_unaligned when writing 32-bit DOI value
      in cipso_v4_gentag_hdr to avoid unaligned memory access.
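
A portable sketch of what an unaligned-safe 32-bit store does is shown below. `store_be32_unaligned` is a hypothetical userspace stand-in for kernel helpers such as put_unaligned_be32(); big-endian (network) byte order for the DOI is an assumption of this example.

```c
#include <stdint.h>

/* Portable stand-in for an unaligned big-endian store: write a 32-bit
 * value one byte at a time, so no aligned 32-bit store is ever issued
 * and the destination pointer may have any alignment. */
static void store_be32_unaligned(uint32_t val, unsigned char *p)
{
    p[0] = (unsigned char)(val >> 24);
    p[1] = (unsigned char)(val >> 16);
    p[2] = (unsigned char)(val >> 8);
    p[3] = (unsigned char)val;
}
```

Casting the destination to a 32-bit pointer and assigning through it, by contrast, can fault or be slow on architectures that require natural alignment.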
      
      v2: unneeded type cast removed as Ondrej Mosnacek suggested.
      Signed-off-by: Sergey Nazarov <s-nazarov@yandex.ru>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e233febd
    • W
      stmmac: intel: Fixes clock registration error seen for multiple interfaces · 8eb37ab7
      Wong Vee Khee committed
      Issue seen when enumerating multiple Intel mGbE interfaces in EHL.
      
      [    6.898141] intel-eth-pci 0000:00:1d.2: enabling device (0000 -> 0002)
      [    6.900971] intel-eth-pci 0000:00:1d.2: Fail to register stmmac-clk
      [    6.906434] intel-eth-pci 0000:00:1d.2: User ID: 0x51, Synopsys ID: 0x52
      
      We fix it by making the clock name unique, following the format
      stmmac-pci_name(pci_dev), so that the clocks of the Intel mGbE
      interfaces on the EHL platform can be told apart, as follows:
      
        /sys/kernel/debug/clk/stmmac-0000:00:1d.1
        /sys/kernel/debug/clk/stmmac-0000:00:1d.2
        /sys/kernel/debug/clk/stmmac-0000:00:1e.4
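
A minimal sketch of building such a per-device name, assuming the PCI name is already available as a string. `stmmac_clk_name` is a hypothetical helper written for this example, not a function from the driver.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: build "stmmac-<pci name>" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
static int stmmac_clk_name(char *buf, size_t len, const char *pci_name)
{
    int n = snprintf(buf, len, "stmmac-%s", pci_name);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because each PCI function has a distinct name, two interfaces can no longer collide on a shared "stmmac-clk" registration.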
      
      Fixes: 58da0cfa ("net: stmmac: create dwmac-intel.c to contain all Intel platform")
      Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
      Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
      Co-developed-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8eb37ab7
    • O
      net: stmmac: Fix VLAN filter delete timeout issue in Intel mGBE SGMII · 9a7b3950
      Ong Boon Leong committed
      For the Intel mGbE controller, the MAC VLAN filter delete operation times
      out if the serdes power-down sequence has already run during driver
      remove(), with the messages below.
      
      [82294.764958] intel-eth-pci 0000:00:1e.4 eth2: stmmac_dvr_remove: removing driver
      [82294.778677] intel-eth-pci 0000:00:1e.4 eth2: Timeout accessing MAC_VLAN_Tag_Filter
      [82294.779997] intel-eth-pci 0000:00:1e.4 eth2: failed to kill vid 0081/0
      [82294.947053] intel-eth-pci 0000:00:1d.2 eth1: stmmac_dvr_remove: removing driver
      [82295.002091] intel-eth-pci 0000:00:1d.1 eth0: stmmac_dvr_remove: removing driver
      
      Therefore, we delay the serdes power-down to be after unregister_netdev()
      which triggers the VLAN filter delete.
      
      Fixes: b9663b7c ("net: stmmac: Enable SERDES power up/down sequence")
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a7b3950
    • J
      net: intel: iavf: fix error return code of iavf_init_get_resources() · 6650d31f
      Jia-Ju Bai committed
      When iavf_process_config() fails, no error return code of
      iavf_init_get_resources() is assigned.
      To fix this bug, err is assigned with the return value of
      iavf_process_config(), and then err is checked.
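
The bug class and its fix can be sketched with a mock. None of the names below come from the driver: `mock_process_config`, `init_buggy` and `init_fixed` are hypothetical and only show the shape of the control flow.

```c
/* Stand-in for a step that can fail; returns 0 or a negative errno-style
 * code chosen by the caller of this sketch. */
static int mock_process_config(int result)
{
    return result;
}

/* Buggy shape: the failure is noticed, but 'err' still holds the 0 left
 * by earlier successful steps, so the caller sees success. */
static int init_buggy(int step_result)
{
    int err = 0;

    if (mock_process_config(step_result))
        goto out;
out:
    return err;
}

/* Fixed shape: capture the return value in 'err', then test it. */
static int init_fixed(int step_result)
{
    int err;

    err = mock_process_config(step_result);
    if (err)
        goto out;
out:
    return err;
}
```

With a failing step, the buggy shape returns 0 while the fixed shape propagates the error code.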
      Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
      Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6650d31f
    • J
      net: tehuti: fix error return code in bdx_probe() · 38c26ff3
      Jia-Ju Bai committed
      When bdx_read_mac() fails, no error return code of bdx_probe()
      is assigned.
      To fix this bug, err is assigned with -EFAULT as error return code.
      Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
      Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      38c26ff3
    • K
      net/mlx4_en: update moderation when config reset · 00ff801b
      Kevin(Yudong) Yang committed
      This patch fixes a bug that the moderation config will not be
      applied when calling mlx4_en_reset_config. For example, when
      turning on rx timestamping, mlx4_en_reset_config() will be called,
      causing the NIC to forget previous moderation config.
      
      This fix is in phase with a previous fix:
      commit 79c54b6b ("net/mlx4_en: Fix TX moderation info loss
      after set_ringparam is called")
      
      Tested: Before this patch, on a host with NIC using mlx4, run
      netserver and stream TCP to the host at full utilization.
      $ sar -I SUM 1
                       INTR    intr/s
      14:03:56          sum  48758.00
      
      After rx hwtstamp is enabled:
      $ sar -I SUM 1
      14:10:38          sum 317771.00
      We see the moderation is not working properly and issued 7x more
      interrupts.
      
      After the patch, and turned on rx hwtstamp, the rate of interrupts
      is as expected:
      $ sar -I SUM 1
      14:52:11          sum  49332.00
      
      Fixes: 79c54b6b ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called")
      Signed-off-by: Kevin(Yudong) Yang <yyd@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Neal Cardwell <ncardwell@google.com>
      CC: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00ff801b
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 638526bb
      David S. Miller committed
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2021-03-04
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 7 non-merge commits during the last 4 day(s) which contain
      a total of 9 files changed, 128 insertions(+), 40 deletions(-).
      
      The main changes are:
      
      1) Fix 32-bit cmpxchg, from Brendan.
      
      2) Fix atomic+fetch logic, from Ilya.
      
      3) Fix usage of bpf_csum_diff in selftests, from Yauheni.
      ====================
      638526bb
  4. 05 Mar, 2021 10 commits
    • B
      bpf: Explicitly zero-extend R0 after 32-bit cmpxchg · 39491867
      Brendan Jackman committed
      As pointed out by Ilya and explained in the new comment, there's a
      discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
      the value from memory into r0, while x86 only does so when r0 and the
      value in memory are different. The same issue affects s390.
      
      At first this might sound like pure semantics, but it makes a real
      difference when the comparison is 32-bit, since the load will
      zero-extend r0/rax.
      
      The fix is to explicitly zero-extend rax after doing such a
      CMPXCHG. Since this problem affects multiple archs, this is done in
      the verifier by patching in a BPF_ZEXT_REG instruction after every
      32-bit cmpxchg. Any archs that don't need such manual zero-extension
      can do a look-ahead with insn_is_zext to skip the unnecessary mov.
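
The discrepancy can be modelled in plain C. This is a simplified model of the semantics described above, not JIT or verifier code; `model_x86_cmpxchg32` and `model_bpf_cmpxchg32` are hypothetical names, and atomicity is not modelled.

```c
#include <stdint.h>

/* Model of a 32-bit x86 CMPXCHG on a 64-bit accumulator: compare the low
 * 32 bits of *rax with *mem. On a match, store src and leave *rax
 * untouched; otherwise load *mem into the low half, which zero-extends
 * the full register. */
static void model_x86_cmpxchg32(uint64_t *rax, uint32_t *mem, uint32_t src)
{
    if ((uint32_t)*rax == *mem)
        *mem = src;
    else
        *rax = (uint64_t)*mem;
}

/* Model of the BPF semantics: the old memory value always ends up
 * zero-extended in r0. Built from the x86 model plus the explicit
 * zero-extension that the fix adds. */
static void model_bpf_cmpxchg32(uint64_t *r0, uint32_t *mem, uint32_t src)
{
    model_x86_cmpxchg32(r0, mem, src);
    *r0 = (uint32_t)*r0;   /* explicit zext; a no-op on the mismatch path */
}
```

On the match path the x86 model keeps whatever was in the upper 32 bits of the accumulator, which is exactly the case the extra zero-extension covers.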
      
      Note this still goes on top of Ilya's patch:
      
      https://lore.kernel.org/bpf/20210301154019.129110-1-iii@linux.ibm.com/T/#u
      
      Differences v5->v6[1]:
       - Moved is_cmpxchg_insn and ensured it can be safely re-used. Also renamed it
         and removed 'inline' to match the style of the is_*_function helpers.
       - Fixed up comments in verifier test (thanks for the careful review, Martin!)
      
      Differences v4->v5[1]:
       - Moved the logic entirely into opt_subreg_zext_lo32_rnd_hi32, thanks to Martin
         for suggesting this.
      
      Differences v3->v4[1]:
       - Moved the optimization against pointless zext into the correct place:
         opt_subreg_zext_lo32_rnd_hi32 is called _after_ fixup_bpf_calls.
      
      Differences v2->v3[1]:
       - Moved patching into fixup_bpf_calls (patch incoming to rename this function)
       - Added extra commentary on bpf_jit_needs_zext
       - Added check to avoid adding a pointless zext(r0) if there's already one there.
      
      Difference v1->v2[1]: Now solved centrally in the verifier instead of
        specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!
      
      [1] v5: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
          v4: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
          v3: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
          v2: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
          v1: https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t

      Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Fixes: 5ffa2550 ("bpf: Add instructions for atomic_[cmp]xchg")
      Signed-off-by: Brendan Jackman <jackmanb@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      39491867
    • P
      cipso,calipso: resolve a number of problems with the DOI refcounts · ad5d07f4
      Paul Moore committed
      The current CIPSO and CALIPSO refcounting scheme for the DOI
      definitions is a bit flawed in that we:
      
      1. Don't correctly match gets/puts in netlbl_cipsov4_list().
      2. Decrement the refcount on each attempt to remove the DOI from the
         DOI list, only removing it from the list once the refcount drops
         to zero.
      
      This patch fixes these problems by adding the missing "puts" to
      netlbl_cipsov4_list() and introduces a more conventional, i.e.
      not-buggy, refcounting mechanism to the DOI definitions.  Upon the
      addition of a DOI to the DOI list, it is initialized with a refcount
      of one, removing a DOI from the list removes it from the list and
      drops the refcount by one; "gets" and "puts" behave as expected with
      respect to refcounts, increasing and decreasing the DOI's refcount by
      one.
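
The refcounting rules in the paragraph above can be sketched as a small model. `struct doi_def` and the `doi_*` functions are hypothetical names for this example and do not mirror the NetLabel structures; list handling and locking are reduced to a flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a DOI definition; not the kernel's struct. */
struct doi_def {
    int refcount;   /* number of outstanding references */
    bool on_list;   /* whether the DOI is still in the DOI list */
    bool freed;     /* set when the last reference is dropped */
};

/* Drop one reference; release the object when the count reaches zero. */
static void doi_put(struct doi_def *d)
{
    if (--d->refcount == 0)
        d->freed = true;
}

/* Take one reference on behalf of a user of the DOI. */
static void doi_get(struct doi_def *d)
{
    d->refcount++;
}

/* Adding to the list starts the count at one: the list's own reference. */
static void doi_add(struct doi_def *d)
{
    d->refcount = 1;
    d->on_list = true;
    d->freed = false;
}

/* Removal always unlinks the DOI and drops only the list's reference. */
static void doi_remove(struct doi_def *d)
{
    if (!d->on_list)
        return;
    d->on_list = false;
    doi_put(d);
}
```

In this model a repeated removal attempt cannot decrement the count again, and the object is released only when the last user calls `doi_put()`.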
      
      Fixes: b1edeb10 ("netlabel: Replace protocol/NetLabel linking with refrerence counts")
      Fixes: d7cce015 ("netlabel: Add support for removing a CALIPSO DOI.")
      Reported-by: syzbot+9ec037722d2603a9f52e@syzkaller.appspotmail.com
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ad5d07f4
    • J
      ibmvnic: always store valid MAC address · 67eb2114
      Jiri Wiesner committed
      The last change to ibmvnic_set_mac(), 8fc3672a, meant to prevent
      users from setting an invalid MAC address on an ibmvnic interface
      that has not been brought up yet. The change also prevented the
      requested MAC address from being stored by the adapter object for an
      ibmvnic interface when the state of the ibmvnic interface is
      VNIC_PROBED - that is after probing has finished but before the
      ibmvnic interface is brought up. The MAC address stored by the
      adapter object is used and sent to the hypervisor for checking when
      an ibmvnic interface is brought up.
      
      The ibmvnic driver ignoring the requested MAC address when in
      VNIC_PROBED state caused LACP bonds (bonds in 802.3ad mode) with more
      than one slave to malfunction. The bonding code must be able to
      change the MAC address of its slaves before they are brought up
      during enslaving. The inability of kernels with 8fc3672a to set
      the MAC addresses of bonding slaves is observable in the output of
      "ip address show". The MAC addresses of the slaves are the same as
      the MAC address of the bond on a working system whereas the slaves
      retain their original MAC addresses on a system with a malfunctioning
      LACP bond.
      
      Fixes: 8fc3672a ("ibmvnic: fix ibmvnic_set_mac")
      Signed-off-by: Jiri Wiesner <jwiesner@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      67eb2114
    • H
      netdevsim: init u64 stats for 32bit hardware · 863a42b2
      Hillf Danton committed
      Init the u64 stats in order to avoid the lockdep prints on the 32bit
      hardware like
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
       CPU: 0 PID: 4695 Comm: syz-executor.0 Not tainted 5.11.0-rc5-syzkaller #0
       Hardware name: ARM-Versatile Express
       Backtrace:
       [<826fc5b8>] (dump_backtrace) from [<826fc82c>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252)
       [<826fc814>] (show_stack) from [<8270d1f8>] (__dump_stack lib/dump_stack.c:79 [inline])
       [<826fc814>] (show_stack) from [<8270d1f8>] (dump_stack+0xa8/0xc8 lib/dump_stack.c:120)
       [<8270d150>] (dump_stack) from [<802bf9c0>] (assign_lock_key kernel/locking/lockdep.c:935 [inline])
       [<8270d150>] (dump_stack) from [<802bf9c0>] (register_lock_class+0xabc/0xb68 kernel/locking/lockdep.c:1247)
       [<802bef04>] (register_lock_class) from [<802baa2c>] (__lock_acquire+0x84/0x32d4 kernel/locking/lockdep.c:4711)
       [<802ba9a8>] (__lock_acquire) from [<802be840>] (lock_acquire.part.0+0xf0/0x554 kernel/locking/lockdep.c:5442)
       [<802be750>] (lock_acquire.part.0) from [<802bed10>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5415)
       [<802beca4>] (lock_acquire) from [<81560548>] (seqcount_lockdep_reader_access include/linux/seqlock.h:103 [inline])
       [<802beca4>] (lock_acquire) from [<81560548>] (__u64_stats_fetch_begin include/linux/u64_stats_sync.h:164 [inline])
       [<802beca4>] (lock_acquire) from [<81560548>] (u64_stats_fetch_begin include/linux/u64_stats_sync.h:175 [inline])
       [<802beca4>] (lock_acquire) from [<81560548>] (nsim_get_stats64+0xdc/0xf0 drivers/net/netdevsim/netdev.c:70)
       [<8156046c>] (nsim_get_stats64) from [<81e2efa0>] (dev_get_stats+0x44/0xd0 net/core/dev.c:10405)
       [<81e2ef5c>] (dev_get_stats) from [<81e53204>] (rtnl_fill_stats+0x38/0x120 net/core/rtnetlink.c:1211)
       [<81e531cc>] (rtnl_fill_stats) from [<81e59d58>] (rtnl_fill_ifinfo+0x6d4/0x148c net/core/rtnetlink.c:1783)
       [<81e59684>] (rtnl_fill_ifinfo) from [<81e5ceb4>] (rtmsg_ifinfo_build_skb+0x9c/0x108 net/core/rtnetlink.c:3798)
       [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3830 [inline])
       [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3821 [inline])
       [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo+0x44/0x70 net/core/rtnetlink.c:3839)
       [<81e5d068>] (rtmsg_ifinfo) from [<81e45c2c>] (register_netdevice+0x664/0x68c net/core/dev.c:10103)
       [<81e455c8>] (register_netdevice) from [<815608bc>] (nsim_create+0xf8/0x124 drivers/net/netdevsim/netdev.c:317)
       [<815607c4>] (nsim_create) from [<81561184>] (__nsim_dev_port_add+0x108/0x188 drivers/net/netdevsim/dev.c:941)
       [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_port_add_all drivers/net/netdevsim/dev.c:990 [inline])
       [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_probe+0x5cc/0x750 drivers/net/netdevsim/dev.c:1119)
       [<81561b0c>] (nsim_dev_probe) from [<815661dc>] (nsim_bus_probe+0x10/0x14 drivers/net/netdevsim/bus.c:287)
       [<815661cc>] (nsim_bus_probe) from [<811724c0>] (really_probe+0x100/0x50c drivers/base/dd.c:554)
       [<811723c0>] (really_probe) from [<811729c4>] (driver_probe_device+0xf8/0x1c8 drivers/base/dd.c:740)
       [<811728cc>] (driver_probe_device) from [<81172fe4>] (__device_attach_driver+0x8c/0xf0 drivers/base/dd.c:846)
       [<81172f58>] (__device_attach_driver) from [<8116fee0>] (bus_for_each_drv+0x88/0xd8 drivers/base/bus.c:431)
       [<8116fe58>] (bus_for_each_drv) from [<81172c6c>] (__device_attach+0xdc/0x1d0 drivers/base/dd.c:914)
       [<81172b90>] (__device_attach) from [<8117305c>] (device_initial_probe+0x14/0x18 drivers/base/dd.c:961)
       [<81173048>] (device_initial_probe) from [<81171358>] (bus_probe_device+0x90/0x98 drivers/base/bus.c:491)
       [<811712c8>] (bus_probe_device) from [<8116e77c>] (device_add+0x320/0x824 drivers/base/core.c:3109)
       [<8116e45c>] (device_add) from [<8116ec9c>] (device_register+0x1c/0x20 drivers/base/core.c:3182)
       [<8116ec80>] (device_register) from [<81566710>] (nsim_bus_dev_new drivers/net/netdevsim/bus.c:336 [inline])
       [<8116ec80>] (device_register) from [<81566710>] (new_device_store+0x178/0x208 drivers/net/netdevsim/bus.c:215)
       [<81566598>] (new_device_store) from [<8116fcb4>] (bus_attr_store+0x2c/0x38 drivers/base/bus.c:122)
       [<8116fc88>] (bus_attr_store) from [<805b4b8c>] (sysfs_kf_write+0x48/0x54 fs/sysfs/file.c:139)
       [<805b4b44>] (sysfs_kf_write) from [<805b3c90>] (kernfs_fop_write_iter+0x128/0x1ec fs/kernfs/file.c:296)
       [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (call_write_iter include/linux/fs.h:1901 [inline])
       [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (new_sync_write fs/read_write.c:518 [inline])
       [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (vfs_write+0x3dc/0x57c fs/read_write.c:605)
       [<804d1f20>] (vfs_write) from [<804d2604>] (ksys_write+0x68/0xec fs/read_write.c:658)
       [<804d259c>] (ksys_write) from [<804d2698>] (__do_sys_write fs/read_write.c:670 [inline])
       [<804d259c>] (ksys_write) from [<804d2698>] (sys_write+0x10/0x14 fs/read_write.c:667)
       [<804d2688>] (sys_write) from [<80200060>] (ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64)
      
      Fixes: 83c9e13a ("netdevsim: add software driver for testing offloads")
      Reported-by: syzbot+e74a6857f2d0efe3ad81@syzkaller.appspotmail.com
      Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Hillf Danton <hdanton@sina.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      863a42b2
    • D
      Merge branch 'mptcp-fixes' · bdda7dfa
      David S. Miller committed
      Mat Martineau says:
      
      ====================
      mptcp: Fixes for v5.12
      
      These patches from the MPTCP tree fix a few multipath TCP issues:
      
      Patches 1 and 5 clear some stale pointers when subflows close.
      
      Patches 2, 4, and 9 plug some memory leaks.
      
      Patch 3 fixes a memory accounting error identified by syzkaller.
      
      Patches 6 and 7 fix a race condition that slowed data transmission.
      
      Patch 8 adds missing wakeups when write buffer space is freed.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bdda7dfa
    • G
      mptcp: free resources when the port number is mismatched · 9238e900
      Geliang Tang committed
      When the port number is mismatched with the announced ones, use
      'goto dispose_child' to free the resources instead of using 'goto out'.
      
      This patch also moves the port number checking code in
      subflow_syn_recv_sock before mptcp_finish_join, otherwise subflow_drop_ctx
      will fail in dispose_child.
      
      Fixes: 5bc56388 ("mptcp: add port number check for MP_JOIN")
      Reported-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9238e900
    • P
      mptcp: fix missing wakeup · 417789df
      Paolo Abeni committed
      __mptcp_clean_una() can free write memory and should wake-up
      user-space processes when needed.
      
      When such function is invoked by the MPTCP receive path, the wakeup
      is not needed, as the TCP stack will later trigger subflow_write_space
      which will do the wakeup as needed.
      
      Other __mptcp_clean_una() call sites need an additional wakeup check.
      Let's bundle the relevant code in a new helper and use it.
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/165
      Fixes: 6e628cd3 ("mptcp: use mptcp release_cb for delayed tasks")
      Fixes: 64b9cea7 ("mptcp: fix spurious retransmissions")
      Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      417789df
    • P
      mptcp: fix race in release_cb · c2e6048f
      Paolo Abeni committed
      If we receive a MPTCP_PUSH_PENDING event from a subflow while
      mptcp_release_cb() is serving the previous one, the new event
      will be delayed up to the next release_sock(msk).

      Address the issue by implementing a test/serve loop for such
      events.
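
A test/serve loop of this kind can be sketched with a mock. All names here (`PUSH_PENDING`, `struct mock_msk`, `mock_push`, `release_once`, `release_loop`) are hypothetical, and locking is not modelled; the sketch only contrasts a one-shot check with a loop that re-tests the flag.

```c
/* Hypothetical event bit; the name mirrors the commit message only. */
#define PUSH_PENDING 0x1u

struct mock_msk {
    unsigned int flags;      /* pending event bits */
    int pushes;              /* how many times the event was served */
    int rearm;               /* events that arrive while serving */
};

/* Serve one push; while it runs, a subflow may raise the event again. */
static void mock_push(struct mock_msk *m)
{
    m->pushes++;
    if (m->rearm > 0) {
        m->rearm--;
        m->flags |= PUSH_PENDING;
    }
}

/* One-shot shape: an event raised during the push stays pending. */
static void release_once(struct mock_msk *m)
{
    if (m->flags & PUSH_PENDING) {
        m->flags &= ~PUSH_PENDING;
        mock_push(m);
    }
}

/* Test/serve loop: re-check the flag after every push. */
static void release_loop(struct mock_msk *m)
{
    while (m->flags & PUSH_PENDING) {
        m->flags &= ~PUSH_PENDING;
        mock_push(m);
    }
}
```

With one event arriving during the first push, the one-shot variant returns with the flag still set, whereas the loop serves both events before returning.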
      
      Additionally rename the push helper to __mptcp_push_pending()
      to be more consistent with the existing code.
      
      Fixes: 6e628cd3 ("mptcp: use mptcp release_cb for delayed tasks")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c2e6048f
    • P
      mptcp: factor out __mptcp_retrans helper() · 2948d0a1
      Paolo Abeni committed
      Will simplify the following patch, no functional change
      intended.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2948d0a1
    • F
      mptcp: reset 'first' and ack_hint on subflow close · c8fe62f0
      Florian Westphal committed
      Just like with last_snd, we have to NULL 'first' on subflow close.
      
      ack_hint isn't strictly required (it's never dereferenced), but better to
      clear this explicitly as well instead of making it an exception.
      
      msk->first is dereferenced unconditionally at accept time, but
      at that point the ssk is not on the conn_list yet -- this means
      worker can't see it when iterating the conn_list.
      Reported-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c8fe62f0