1. 06 Jan, 2021 25 commits
    • J
      net: cdc_ncm: correct overhead in delayed_ndp_size · 7a68d725
      Jouni K. Seppänen committed
      Aligning to tx_ndp_modulus is not sufficient because the next align
      call can be cdc_ncm_align_tail, which can add up to ctx->tx_modulus +
      ctx->tx_remainder - 1 bytes. This used to lead to occasional crashes
      on a Huawei 909s-120 LTE module as follows:
      
      - the condition marked /* if there is a remaining skb [...] */ is true
        so the swaps happen
      - skb_out is set from ctx->tx_curr_skb
      - skb_out->len is exactly 0x3f52
      - ctx->tx_curr_size is 0x4000 and delayed_ndp_size is 0xac
        (note that the sum of skb_out->len and delayed_ndp_size is 0x3ffe)
      - the for loop over n is executed once
      - the cdc_ncm_align_tail call marked /* align beginning of next frame */
        increases skb_out->len to 0x3f56 (the sum is now 0x4002)
      - the condition marked /* check if we had enough room left [...] */ is
        false so we break out of the loop
      - the condition marked /* If requested, put NDP at end of frame. */ is
        true so the NDP is written into skb_out
      - now skb_out->len is 0x4002, so padding_count is minus two interpreted
        as an unsigned number, which is used as the length argument to memset,
        leading to a crash with various symptoms but usually including
      
      > Call Trace:
      >  <IRQ>
      >  cdc_ncm_fill_tx_frame+0x83a/0x970 [cdc_ncm]
      >  cdc_mbim_tx_fixup+0x1d9/0x240 [cdc_mbim]
      >  usbnet_start_xmit+0x5d/0x720 [usbnet]
      
      The cdc_ncm_align_tail call first aligns on a ctx->tx_modulus
      boundary (adding at most ctx->tx_modulus-1 bytes), then adds
      ctx->tx_remainder bytes. Alternatively, the next alignment call can
      occur in cdc_ncm_ndp16 or cdc_ncm_ndp32, in which case at most
      ctx->tx_ndp_modulus-1 bytes are added.
      
      A similar problem has occurred before, and the code is nontrivial to
      reason about, so add a guard before the crashing call. By that time it
      is too late to prevent any memory corruption (we'll have written past
      the end of the buffer already) but we can at least try to get a warning
      written into an on-disk log by avoiding the hard crash caused by padding
      past the buffer with a huge number of zeros.
      Signed-off-by: Jouni K. Seppänen <jks@iki.fi>
      Fixes: 4a0e3e98 ("cdc_ncm: Add support for moving NDP to end of NCM frame")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=209407
      Reported-by: kernel test robot <lkp@intel.com>
      Reviewed-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7a68d725
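The unsigned wrap-around described in the cdc_ncm commit above can be reproduced in isolation. The following is a minimal sketch, not the driver's code: `padding_unguarded` and `padding_needed` are hypothetical helpers, and the numbers are the ones quoted in the commit message (frame size 0x4000, final length 0x4002).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unguarded computation: wraps around when len exceeds max_size,
 * producing a huge value that would then be passed to memset. */
static uint32_t padding_unguarded(uint32_t max_size, uint32_t len)
{
    return max_size - len;
}

/* Hypothetical guarded helper: returns the number of zero bytes needed
 * to fill the frame, or 0 with *overflow set when the frame is already
 * longer than max_size (the buffer has been overrun by then, but the
 * caller can warn instead of padding with a wrapped length). */
static size_t padding_needed(uint32_t max_size, uint32_t len, int *overflow)
{
    *overflow = len > max_size;
    if (*overflow)
        return 0;
    return (size_t)(max_size - len);
}
```

With the values from the commit message, `padding_unguarded(0x4000, 0x4002)` yields 0xfffffffe ("minus two interpreted as an unsigned number"), while the guarded variant reports the overflow and pads nothing.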
    • D
      Merge branch 'hns3-fixes' · be8d1e0e
      David S. Miller committed
      Huazhong Tan says:
      
      ====================
      net: hns3: fixes for -net
      
      There are some bugfixes for the HNS3 ethernet driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      be8d1e0e
    • J
      net: hns3: fix incorrect handling of sctp6 rss tuple · ab6e32d2
      Jian Shen committed
      For DEVICE_VERSION_V2, the hardware only supports src-ip,
      dst-ip and verification-tag for rss tuple set of sctp6
      packet. For DEVICE_VERSION_V3, the hardware supports
      src-port and dst-port as well.
      
      Currently, when a user queries the sctp6 rss tuple info,
      some unsupported information is shown on V2. So add
      a check for the hardware version when initializing and querying
      the sctp6 rss tuple to fix this issue.
      
      Fixes: 46a3df9f ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab6e32d2
    • Y
      net: hns3: fix the number of queues actually used by ARQ · 65e61e3c
      Yufeng Mo committed
      HCLGE_MBX_MAX_ARQ_MSG_NUM is used to allocate memory for the number
      of queues used by ARQ (Asynchronous Receive Queue), so the head
      and tail pointers should also use this macro.
      
      Fixes: 07a0556a ("net: hns3: Changes to support ARQ(Asynchronous Receive Queue)")
      Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65e61e3c
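The point of the ARQ commit above is that the bound used to size a ring must also be the bound used to wrap its head and tail indices. A minimal sketch of that idea follows; `ARQ_MSG_NUM` is a hypothetical stand-in for HCLGE_MBX_MAX_ARQ_MSG_NUM and the ring type is illustrative, not the hns3 data structure.

```c
#include <assert.h>

/* Hypothetical stand-in for HCLGE_MBX_MAX_ARQ_MSG_NUM. */
#define ARQ_MSG_NUM 4

struct arq_ring {
    int msg[ARQ_MSG_NUM];  /* storage sized by the macro */
    unsigned int head;     /* next slot to read */
    unsigned int tail;     /* next slot to write */
    unsigned int count;    /* slots currently in use */
};

/* Advance an index, wrapping at the same bound used to size the array. */
static unsigned int arq_next(unsigned int idx)
{
    return (idx + 1) % ARQ_MSG_NUM;
}

static int arq_push(struct arq_ring *r, int v)
{
    if (r->count == ARQ_MSG_NUM)
        return -1;  /* full */
    r->msg[r->tail] = v;
    r->tail = arq_next(r->tail);
    r->count++;
    return 0;
}

static int arq_pop(struct arq_ring *r, int *v)
{
    if (r->count == 0)
        return -1;  /* empty */
    *v = r->msg[r->head];
    r->head = arq_next(r->head);
    r->count--;
    return 0;
}
```

If the wrap bound were a different constant from the array size, the indices would either run past the array or leave part of it unused.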
    • Y
      net: hns3: fix a phy loopback fail issue · f04bbcbf
      Yonglong Liu committed
      When phy driver does not implement the set_loopback interface,
      phy loopback test will return -EOPNOTSUPP, and the loopback test
      will fail. So when phy driver does not implement the set_loopback
      interface, don't do phy loopback test.
      
      Fixes: c9765a89 ("net: hns3: add phy selftest function")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f04bbcbf
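The approach in the phy loopback commit above (skip the test when the optional callback is absent, rather than failing on -EOPNOTSUPP) can be sketched as below. The `phy_ops` struct, `TEST_SKIPPED` value, and function names are assumptions made for illustration; they are not the hns3 or phylib API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver ops table with an optional callback. */
struct phy_ops {
    int (*set_loopback)(int enable);
};

enum { TEST_SKIPPED = 1 };

/* Run the loopback test only when the driver implements the hook.
 * Returns 0 on success, TEST_SKIPPED when unsupported, or a negative
 * error code from the callback. */
static int run_phy_loopback_test(const struct phy_ops *ops)
{
    int ret;

    if (ops == NULL || ops->set_loopback == NULL)
        return TEST_SKIPPED;  /* not a failure: nothing to test */

    ret = ops->set_loopback(1);
    if (ret)
        return ret;
    return ops->set_loopback(0);
}

/* Trivial callback used to exercise the supported path. */
static int fake_set_loopback(int enable)
{
    (void)enable;
    return 0;
}
```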
    • J
      docs: net: fix documentation on .ndo_get_stats · 9f9d41f0
      Jakub Kicinski committed
      Fix calling context.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9f9d41f0
    • D
      Merge branch 'stmmac-fixes' · 8db25530
      David S. Miller committed
      Samuel Holland says:
      
      ====================
      Fixes for dwmac-sun8i suspend/resume
      
      This series fixes issues preventing dwmac-sun8i from working after a
      suspend/resume cycle. Those issues include the PHY being left powered
      off, the MAC syscon configuration being reset, and the reference to the
      reset controller being improperly dropped. They also fix related issues
      in probe error handling and driver removal.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8db25530
    • S
      net: stmmac: dwmac-sun8i: Balance syscon (de)initialization · 9b1e39cf
      Samuel Holland committed
      Previously, sun8i_dwmac_set_syscon was called from a chain of functions
      in several different files:
          sun8i_dwmac_probe
            stmmac_dvr_probe
              stmmac_hw_init
                stmmac_hwif_init
                  sun8i_dwmac_setup
                    sun8i_dwmac_set_syscon
      which made the lifetime of the syscon values hard to reason about. Part
      of the problem is that there is no similar platform driver callback from
      stmmac_dvr_remove. As a result, the driver unset the syscon value in
      sun8i_dwmac_exit, but this leaves it uninitialized after a suspend/
      resume cycle. It was also unset a second time (outside sun8i_dwmac_exit)
      in the probe error path.
      
      Move the init to the earliest available place in sun8i_dwmac_probe
      (after stmmac_probe_config_dt, which initializes plat_dat), and the
      deinit to the corresponding position in the cleanup order.
      
      Since priv is not filled in until stmmac_dvr_probe, this requires
      changing the sun8i_dwmac_set_syscon parameters to priv's two relevant
      members.
      
      Fixes: 9f93ac8d ("net-next: stmmac: Add dwmac-sun8i")
      Fixes: 634db83b ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs")
      Signed-off-by: Samuel Holland <samuel@sholland.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9b1e39cf
    • S
      net: stmmac: dwmac-sun8i: Balance internal PHY power · b8239638
      Samuel Holland committed
      sun8i_dwmac_exit calls sun8i_dwmac_unpower_internal_phy, but
      sun8i_dwmac_init did not call sun8i_dwmac_power_internal_phy. This
      caused PHY power to remain off after a suspend/resume cycle. Fix this by
      recording if PHY power should be restored, and if so, restoring it.
      
      Fixes: 634db83b ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs")
      Signed-off-by: Samuel Holland <samuel@sholland.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b8239638
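The imbalance described in the PHY power commit above (exit powers the PHY down, init never powers it back up) can be modelled with a small state sketch. The struct and function names are hypothetical and only mirror the idea of recording whether power should be restored; they are not the dwmac-sun8i code.

```c
#include <assert.h>
#include <stdbool.h>

struct fake_mac {
    bool use_internal_phy;  /* recorded once: should init restore power? */
    bool phy_powered;       /* current power state */
};

/* Called at probe and again on resume. */
static void fake_init(struct fake_mac *m)
{
    if (m->use_internal_phy)
        m->phy_powered = true;  /* restore what exit turned off */
}

/* Called at removal and again on suspend. */
static void fake_exit(struct fake_mac *m)
{
    m->phy_powered = false;
}
```

Because init and exit are now symmetric, a suspend/resume cycle (exit followed by init) leaves the PHY powered whenever it was in use beforehand.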
    • S
      net: stmmac: dwmac-sun8i: Balance internal PHY resource references · 52925421
      Samuel Holland committed
      While stmmac_pltfr_remove calls sun8i_dwmac_exit, the sun8i_dwmac_init
      and sun8i_dwmac_exit functions are also called by the stmmac_platform
      suspend/resume callbacks. They may be called many times during the
      device's lifetime and should not release resources used by the driver.
      
      Furthermore, there was no error handling in case registering the MDIO
      mux failed during probe, and the EPHY clock was never released at all.
      
      Fix all of these issues by moving the deinitialization code to a driver
      removal callback. Also ensure the EPHY is powered down before removal.
      
      Fixes: 634db83b ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs")
      Signed-off-by: Samuel Holland <samuel@sholland.org>
      Reviewed-by: Chen-Yu Tsai <wens@csie.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52925421
    • S
      net: stmmac: dwmac-sun8i: Fix probe error handling · 7eeecc4b
      Samuel Holland committed
      stmmac_pltfr_remove does three things in one function, making it
      inappropriate for unwinding the steps in the probe function. Currently,
      a failure before the call to stmmac_dvr_probe would leak OF node
      references due to missing a call to stmmac_remove_config_dt. And an
      error in stmmac_dvr_probe would cause the driver to attempt to remove a
      netdevice that was never added. Fix these by reordering the init and
      splitting out the error handling steps.
      
      Fixes: 9f93ac8d ("net-next: stmmac: Add dwmac-sun8i")
      Fixes: 40a1dcee ("net: ethernet: dwmac-sun8i: Use the correct function in exit path")
      Signed-off-by: Samuel Holland <samuel@sholland.org>
      Reviewed-by: Chen-Yu Tsai <wens@csie.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7eeecc4b
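Splitting the error handling into separate steps, as the probe commit above describes, is commonly written with goto labels that undo the acquisitions in reverse order. The sketch below is a generic illustration of that pattern under assumed names (`held_config`, `held_syscon`, `held_netdev`, `step`); it does not reproduce the actual sun8i_dwmac_probe.

```c
#include <assert.h>

/* Hypothetical flags tracking which resources are currently held. */
static int held_config, held_syscon, held_netdev;

/* Simulated acquisition step that can be forced to fail. */
static int step(int fail)
{
    return fail ? -1 : 0;
}

/* fail_at selects which step fails (0 = none). Each label releases
 * exactly the resources acquired before the failing step. */
static int probe(int fail_at)
{
    int ret;

    ret = step(fail_at == 1);
    if (ret)
        return ret;
    held_config = 1;

    ret = step(fail_at == 2);
    if (ret)
        goto err_remove_config;
    held_syscon = 1;

    ret = step(fail_at == 3);
    if (ret)
        goto err_unset_syscon;
    held_netdev = 1;
    return 0;

err_unset_syscon:
    held_syscon = 0;
err_remove_config:
    held_config = 0;
    return ret;
}
```

A failure in the last step releases the two earlier resources and never touches the one that was not acquired, which is the property a single combined remove function cannot provide.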
    • J
      net: vlan: avoid leaks on register_vlan_dev() failures · 55b7ab11
      Jakub Kicinski committed
      VLAN checks for NETREG_UNINITIALIZED to distinguish between
      registration failure and unregistration in progress.
      
      Since commit cb626bf5 ("net-sysfs: Fix reference count leak")
      registration failure may, however, result in NETREG_UNREGISTERED
      as well as NETREG_UNINITIALIZED.
      
      This fix is similar to cebb6975 ("rtnetlink: Fix
      memory(net_device) leak when ->newlink fails")
      
      Fixes: cb626bf5 ("net-sysfs: Fix reference count leak")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      55b7ab11
    • J
      net: suggest L2 discards be counted towards rx_dropped · cf072069
      Jakub Kicinski committed
      From the existing definitions it's unclear which stat to
      use to report filtering based on L2 dst addr in old
      broadcast-medium Ethernet.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cf072069
    • C
      net/sonic: Fix some resource leaks in error handling paths · 0f7ba7bc
      Christophe JAILLET committed
      A call to dma_alloc_coherent() is wrapped by sonic_alloc_descriptors().
      
      This is correctly freed in the remove function, but not in the error
      handling path of the probe function. Fix this by adding the missing
      dma_free_coherent() call.
      
      While at it, rename a label in order to be slightly more informative.
      
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Chris Zankel <chris@zankel.net>
      References: commit 10e3cc18 ("net/sonic: Fix a resource leak in an error handling path in 'jazz_sonic_probe()'")
      Fixes: 74f2a5f0 ("xtensa: Add support for the Sonic Ethernet device for the XT2000 board.")
      Fixes: efcce839 ("[PATCH] macsonic/jazzsonic network drivers update")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0f7ba7bc
    • A
      wan: ds26522: select CONFIG_BITREVERSE · 69931e11
      Arnd Bergmann committed
      Without this, the driver runs into a link failure
      
      arm-linux-gnueabi-ld: drivers/net/wan/slic_ds26522.o: in function `slic_ds26522_probe':
      slic_ds26522.c:(.text+0x100c): undefined reference to `byte_rev_table'
      arm-linux-gnueabi-ld: slic_ds26522.c:(.text+0x1cdc): undefined reference to `byte_rev_table'
      arm-linux-gnueabi-ld: drivers/net/wan/slic_ds26522.o: in function `slic_write':
      slic_ds26522.c:(.text+0x1e4c): undefined reference to `byte_rev_table'
      
      Fixes: c37d4a00 ("Maxim/driver: Add driver for maxim ds26522")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      69931e11
    • A
      misdn: dsp: select CONFIG_BITREVERSE · 51049bd9
      Arnd Bergmann committed
      Without this, we run into a link error
      
      arm-linux-gnueabi-ld: drivers/isdn/mISDN/dsp_audio.o: in function `dsp_audio_generate_law_tables':
      (.text+0x30c): undefined reference to `byte_rev_table'
      arm-linux-gnueabi-ld: drivers/isdn/mISDN/dsp_audio.o:(.text+0x5e4): more undefined references to `byte_rev_table' follow
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      51049bd9
    • A
      cfg80211: select CONFIG_CRC32 · 152a8a6c
      Arnd Bergmann committed
      Without crc32 support, this fails to link:
      
      arm-linux-gnueabi-ld: net/wireless/scan.o: in function `cfg80211_scan_6ghz':
      scan.c:(.text+0x928): undefined reference to `crc32_le'
      
      Fixes: c8cb5b85 ("nl80211/cfg80211: support 6 GHz scanning")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      152a8a6c
    • A
      wil6210: select CONFIG_CRC32 · e186620d
      Arnd Bergmann committed
      Without crc32, the driver fails to link:
      
      arm-linux-gnueabi-ld: drivers/net/wireless/ath/wil6210/fw.o: in function `wil_fw_verify':
      fw.c:(.text+0x74c): undefined reference to `crc32_le'
      arm-linux-gnueabi-ld: drivers/net/wireless/ath/wil6210/fw.o:fw.c:(.text+0x758): more undefined references to `crc32_le' follow
      
      Fixes: 151a9706 ("wil6210: firmware download")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e186620d
    • A
      can: kvaser_pciefd: select CONFIG_CRC32 · 1d48595c
      Arnd Bergmann committed
      Without crc32, this driver fails to link:
      
      arm-linux-gnueabi-ld: drivers/net/can/kvaser_pciefd.o: in function `kvaser_pciefd_probe':
      kvaser_pciefd.c:(.text+0x2b0): undefined reference to `crc32_be'
      
      Fixes: 26ad340e ("can: kvaser_pciefd: Add driver for Kvaser PCIEcan devices")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1d48595c
    • A
      phy: dp83640: select CONFIG_CRC32 · f9d6f941
      Arnd Bergmann committed
      Without crc32, this driver fails to link:
      
      arm-linux-gnueabi-ld: drivers/net/phy/dp83640.o: in function `match':
      dp83640.c:(.text+0x476c): undefined reference to `crc32_le'
      
      Fixes: 539e44d2 ("dp83640: Include hash in timestamp/packet matching")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f9d6f941
    • A
      qed: select CONFIG_CRC32 · 2860d45a
      Arnd Bergmann committed
      Without this, the driver fails to link:
      
      lpc_eth.c:(.text+0x1934): undefined reference to `crc32_le'
      arm-linux-gnueabi-ld: drivers/net/ethernet/qlogic/qed/qed_debug.o: in function `qed_grc_dump':
      qed_debug.c:(.text+0x4068): undefined reference to `crc32_le'
      arm-linux-gnueabi-ld: drivers/net/ethernet/qlogic/qed/qed_debug.o: in function `qed_idle_chk_dump':
      qed_debug.c:(.text+0x51fc): undefined reference to `crc32_le'
      arm-linux-gnueabi-ld: drivers/net/ethernet/qlogic/qed/qed_debug.o: in function `qed_mcp_trace_dump':
      qed_debug.c:(.text+0x6000): undefined reference to `crc32_le'
      arm-linux-gnueabi-ld: drivers/net/ethernet/qlogic/qed/qed_debug.o: in function `qed_dbg_reg_fifo_dump':
      qed_debug.c:(.text+0x66cc): undefined reference to `crc32_le'
      arm-linux-gnueabi-ld: drivers/net/ethernet/qlogic/qed/qed_debug.o:qed_debug.c:(.text+0x6aa4): more undefined references to `crc32_le' follow
      
      Fixes: 7a4b21b7 ("qed: Add nvram selftest")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2860d45a
    • L
      Merge tag 'arc-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · f6e7a024
      Linus Torvalds committed
      Pull ARC updates from Vineet Gupta:
       "Things are quieter on upstreaming front as we are mostly focusing on
        ARCv3/ARC64 port.
      
        This contains just build system updates from Masahiro Yamada"
      
      * tag 'arc-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: build: use $(READELF) instead of hard-coded readelf
        ARC: build: remove unneeded extra-y
        ARC: build: move symlink creation to arch/arc/Makefile to avoid race
        ARC: build: add boot_targets to PHONY
        ARC: build: add uImage.lzma to the top-level target
        ARC: build: remove non-existing bootpImage from KBUILD_IMAGE
      f6e7a024
    • L
      Merge tag 'net-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · aa35e45c
      Linus Torvalds committed
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes, including fixes from netfilter, wireless and bpf
        trees.
      
        Current release - regressions:
      
         - mt76: fix NULL pointer dereference in mt76u_status_worker and
           mt76s_process_tx_queue
      
         - net: ipa: fix interconnect enable bug
      
        Current release - always broken:
      
         - netfilter: fixes possible oops in mtype_resize in ipset
      
         - ath11k: fix number of coding issues found by static analysis tools
           and spurious error messages
      
        Previous releases - regressions:
      
         - e1000e: re-enable s0ix power saving flows for systems with the
           Intel i219-LM Ethernet controllers to fix power use regression
      
         - virtio_net: fix recursive call to cpus_read_lock() to avoid a
           deadlock
      
         - ipv4: ignore ECN bits for fib lookups in fib_compute_spec_dst()
      
         - sysfs: take the rtnl lock around XPS configuration
      
         - xsk: fix memory leak for failed bind and rollback reservation at
           NETDEV_TX_BUSY
      
         - r8169: work around power-saving bug on some chip versions
      
        Previous releases - always broken:
      
         - dcb: validate netlink message in DCB handler
      
         - tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS
           to prevent unnecessary retries
      
         - vhost_net: fix ubuf refcount when sendmsg fails
      
         - bpf: save correct stopping point in file seq iteration
      
         - ncsi: use real net-device for response handler
      
         - neighbor: fix div by zero caused by a data race (TOCTOU)
      
         - bareudp: fix use of incorrect min_headroom size and a false
           positive lockdep splat from the TX lock
      
         - mvpp2:
            - clear force link UP during port init procedure in case
              bootloader had set it
            - add TCAM entry to drop flow control pause frames
            - fix PPPoE with ipv6 packet parsing
            - fix GoP Networking Complex Control config of port 3
            - fix pkt coalescing IRQ-threshold configuration
      
         - xsk: fix race in SKB mode transmit with shared cq
      
         - ionic: account for vlan tag len in rx buffer len
      
         - stmmac: ignore the second clock input, current clock framework does
           not handle exclusive clock use well, other drivers may reconfigure
           the second clock
      
        Misc:
      
         - ppp: change PPPIOCUNBRIDGECHAN ioctl request number to follow
           existing scheme"
      
      * tag 'net-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits)
        net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access
        net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs
        net: lapb: Decrease the refcount of "struct lapb_cb" in lapb_device_event
        r8169: work around power-saving bug on some chip versions
        net: usb: qmi_wwan: add Quectel EM160R-GL
        selftests: mlxsw: Set headroom size of correct port
        net: macb: Correct usage of MACB_CAPS_CLK_HW_CHG flag
        ibmvnic: fix: NULL pointer dereference.
        docs: networking: packet_mmap: fix old config reference
        docs: networking: packet_mmap: fix formatting for C macros
        vhost_net: fix ubuf refcount incorrectly when sendmsg fails
        bareudp: Fix use of incorrect min_headroom size
        bareudp: set NETIF_F_LLTX flag
        net: hdlc_ppp: Fix issues when mod_timer is called while timer is running
        atlantic: remove architecture depends
        erspan: fix version 1 check in gre_parse_header()
        net: hns: fix return value check in __lb_other_process()
        net: sched: prevent invalid Scell_log shift count
        net: neighbor: fix a crash caused by mod zero
        ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()
        ...
      aa35e45c
    • L
      Merge tag 'afs-fixes-04012021' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 6207214a
      Linus Torvalds committed
      Pull AFS fixes from David Howells:
       "Two fixes.
      
        The first is the fix for the strnlen() array limit check and the
        second fixes the calculation of the number of dirent records used to
        represent any particular filename length"
      
      * tag 'afs-fixes-04012021' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Fix directory entry size calculation
        afs: Work around strnlen() oops with CONFIG_FORTIFIED_SOURCE=y
      6207214a
    • L
      mm: make wait_on_page_writeback() wait for multiple pending writebacks · c2407cf7
      Linus Torvalds committed
      Ever since commit 2a9127fc ("mm: rewrite wait_on_page_bit_common()
      logic") we've had some very occasional reports of BUG_ON(PageWriteback)
      in write_cache_pages(), which we thought we already fixed in commit
      073861ed ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)").
      
      But syzbot just reported another one, even with that commit in place.
      
      And it turns out that there's a simpler way to trigger the BUG_ON() than
      the one Hugh found with page re-use.  It all boils down to the fact that
      the page writeback is ostensibly serialized by the page lock, but that
      isn't actually really true.
      
      Yes, the people _setting_ writeback all do so under the page lock, but
      the actual clearing of the bit - and waking up any waiters - happens
      without any page lock.
      
      This gives us this fairly simple race condition:
      
        CPU1 = end previous writeback
        CPU2 = start new writeback under page lock
        CPU3 = write_cache_pages()
      
        CPU1          CPU2            CPU3
        ----          ----            ----
      
        end_page_writeback()
          test_clear_page_writeback(page)
          ... delayed...
      
                      lock_page();
                      set_page_writeback()
                      unlock_page()
      
                                      lock_page()
                                      wait_on_page_writeback();
      
          wake_up_page(page, PG_writeback);
          .. wakes up CPU3 ..
      
                                      BUG_ON(PageWriteback(page));
      
      where the BUG_ON() happens because we woke up the PG_writeback bit
      because of the _previous_ writeback, but a new one had already been
      started because the clearing of the bit wasn't actually atomic wrt the
      actual wakeup or serialized by the page lock.
      
      The reason this didn't use to happen was that the old logic in waiting
      on a page bit would just loop if it ever saw the bit set again.
      
      The nice proper fix would probably be to get rid of the whole "wait for
      writeback to clear, and then set it" logic in the writeback path, and
      replace it with an atomic "wait-to-set" (ie the same as we have for page
      locking: we set the page lock bit with a single "lock_page()", not with
      "wait for lock bit to clear and then set it").
      
      However, our current model for writeback is that the waiting for the
      writeback bit is done by the generic VFS code (ie write_cache_pages()),
      but the actual setting of the writeback bit is done much later by the
      filesystem ".writepages()" function.
      
      IOW, to make the writeback bit have that same kind of "wait-to-set"
      behavior as we have for page locking, we'd have to change our roughly
      ~50 different writeback functions.  Painful.
      
      Instead, just make "wait_on_page_writeback()" loop on the very unlikely
      situation that the PG_writeback bit is still set, basically re-instating
      the old behavior.  This is very non-optimal in case of contention, but
      since we only ever set the bit under the page lock, that situation is
      controlled.
      
      Reported-by: syzbot+2fc0712f8f8b8b8fa0ef@syzkaller.appspotmail.com
      Fixes: 2a9127fc ("mm: rewrite wait_on_page_bit_common() logic")
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c2407cf7
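The difference between waiting once and looping until the bit is clear, as discussed in the wait_on_page_writeback() commit above, can be shown with a single-threaded simulation. Everything here is an assumption made for illustration: `fake_page` stands in for a page with its PG_writeback bit, and `wait_on_bit_once` scripts a stale wakeup instead of really sleeping.

```c
#include <assert.h>
#include <stdbool.h>

struct fake_page {
    bool writeback;     /* stand-in for the PG_writeback bit */
    int stale_wakeups;  /* wakeups that arrive while the bit is set again */
    int waits;          /* how many times the waiter slept */
};

/* Simulated sleep that returns when "woken". A stale wakeup belongs to
 * the previous writeback, so the bit is still set on return. */
static void wait_on_bit_once(struct fake_page *p)
{
    p->waits++;
    if (p->stale_wakeups > 0) {
        p->stale_wakeups--;
        return;
    }
    p->writeback = false;  /* the current writeback really finished */
}

/* Waits a single time: can return with the bit still set. */
static void wait_single(struct fake_page *p)
{
    if (p->writeback)
        wait_on_bit_once(p);
}

/* Loops until the bit is observed clear, as the commit message proposes. */
static void wait_looping(struct fake_page *p)
{
    while (p->writeback)
        wait_on_bit_once(p);
}
```

With one scripted stale wakeup, `wait_single` returns while the bit is still set (the situation that trips the BUG_ON), whereas `wait_looping` sleeps a second time and returns only once the bit is clear.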
  2. 05 Jan, 2021 15 commits
    • J
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · a8f33c03
      Jakub Kicinski committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Missing sanitization of rateest userspace string, bug has been
         triggered by syzbot, patch from Florian Westphal.
      
      2) Report EOPNOTSUPP on missing set features in nft_dynset, otherwise
         error reporting to userspace via EINVAL is misleading since this is
         reserved for malformed netlink requests.
      
      3) New binaries with old kernels might silently accept several set
         element expressions. New binaries set the NFT_SET_EXPR and
         NFT_DYNSET_F_EXPR flags to request several expressions per
         element, hence old kernels which do not support this bail out
         with EOPNOTSUPP.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
        netfilter: nftables: add set expression flags
        netfilter: nft_dynset: report EOPNOTSUPP on missing set feature
        netfilter: xt_RATEEST: reject non-null terminated string from userspace
      ====================
      
      Link: https://lore.kernel.org/r/20210103192920.18639-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a8f33c03
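Items 2 and 3 of the netfilter summary above rely on a distinction between two error codes: EINVAL for malformed requests and EOPNOTSUPP for well-formed requests that ask for a feature the kernel lacks. A minimal sketch of that check follows; the flag names and values are hypothetical and are not the nftables UAPI constants.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical feature flags; values chosen for illustration only. */
#define SET_F_TIMEOUT 0x1u
#define SET_F_EXPR    0x2u

/* An "old kernel" in this sketch knows about timeouts only. */
#define SUPPORTED_FLAGS (SET_F_TIMEOUT)

/* well_formed models whether the request parsed correctly. */
static int check_set_flags(unsigned int flags, int well_formed)
{
    if (!well_formed)
        return -EINVAL;      /* malformed request */
    if (flags & ~SUPPORTED_FLAGS)
        return -EOPNOTSUPP;  /* valid request, unsupported feature */
    return 0;
}
```

Because the new flag is rejected explicitly, a newer userspace binary gets a clear "not supported" answer instead of having the extra expressions silently ignored.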
    • J
      Merge branch 'net-dsa-lantiq_gswip-two-fixes-for-net-stable' · 08ad4839
      Jakub Kicinski committed
      Martin Blumenstingl says:
      
      ====================
      net: dsa: lantiq_gswip: two fixes for -net/-stable
      
      While testing the lantiq_gswip driver in OpenWrt at least one board had
      a non-working Ethernet port connected to an internal 100Mbit/s PHY22F
      GPHY. The problem which could be observed:
      - the PHY would detect the link just fine
      - ethtool stats would see the TX counter rise
      - the RX counter in ethtool was stuck at zero
      
      It turns out that two independent patches are needed to fix this:
      - first we need to enable the MII data lines also for internal PHYs
      - second we need to program the GSWIP_MII_CFG registers for all ports
        except the CPU port
      
      These two patches have also been tested by back-porting them on top of
      Linux 5.4.86 in OpenWrt.
      
      Special thanks to Hauke for debugging and brainstorming this on IRC
      with me!
      ====================
      
      Link: https://lore.kernel.org/r/20210103012544.3259029-1-martin.blumenstingl@googlemail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      08ad4839
    • M
      net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access · 709a3c9d
      Martin Blumenstingl committed
      There is one GSWIP_MII_CFG register for each switch-port except the CPU
      port. The register offset for the first port is 0x0, 0x02 for the
      second, 0x04 for the third and so on.
      
      Update the driver to not only restrict the GSWIP_MII_CFG registers to
      ports 0, 1 and 5. Handle ports 0..5 instead but skip the CPU port. This
      means we are not overwriting the configuration for the third port (port
      two since we start counting from zero) with the settings for the sixth
      port (with number five) anymore.
      
      The GSWIP_MII_PCDU(p) registers are not updated because there's really
      only three (one for each of the following ports: 0, 1, 5).
      
      Fixes: 14fceff4 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      709a3c9d
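The register layout described in the GSWIP_MII_CFG commit above (offset 0x0 for the first port, 0x02 for the second, 0x04 for the third, and no register for the CPU port) amounts to a simple per-port stride. The helper below is a hypothetical sketch of that mapping; the CPU port number and port range are passed in as parameters because the commit message does not state them as constants.

```c
#include <assert.h>

/* Stride between per-port registers, per the offsets quoted above. */
#define MII_CFG_STRIDE 0x2

/* Returns the register offset for a port, or -1 when the port is the
 * CPU port or lies outside 0..max_port. */
static int mii_cfg_offset(int port, int cpu_port, int max_port)
{
    if (port < 0 || port > max_port || port == cpu_port)
        return -1;
    return port * MII_CFG_STRIDE;
}
```

Deriving the offset from the port number gives every port in 0..5 its own register, so the sixth port no longer lands on the third port's slot.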
    • M
      net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs · c1a9ec7e
      Martin Blumenstingl committed
      Enable GSWIP_MII_CFG_EN also for internal PHYs to make traffic flow.
      Without this the PHY link is detected properly and ethtool statistics
      for TX are increasing but there's no RX traffic coming in.
      
      Fixes: 14fceff4 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
      Suggested-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c1a9ec7e
    • X
      net: lapb: Decrease the refcount of "struct lapb_cb" in lapb_device_event · b40f97b9
      Xie He committed
      In lapb_device_event, lapb_devtostruct is called to get a reference to
      an object of "struct lapb_cb". lapb_devtostruct increases the refcount
      of the object and returns a pointer to it. However, we didn't decrease
      the refcount after we finished using the pointer. This patch fixes this
      problem.
      
      Fixes: a4989fa9 ("net/lapb: support netdev events")
      Cc: Martin Schiller <ms@dev.tdt.de>
      Signed-off-by: Xie He <xie.he.0141@gmail.com>
      Link: https://lore.kernel.org/r/20201231174331.64539-1-xie.he.0141@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b40f97b9
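The get/put pairing described in the lapb commit above can be sketched in plain C. This is a minimal model, not the kernel code: `struct cb`, `cb_lookup()`, `cb_put()` and `handle_event()` are hypothetical stand-ins for "struct lapb_cb", lapb_devtostruct and the event handler, and a plain int replaces the kernel's reference counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "struct lapb_cb"; only the refcount is modelled. */
struct cb {
    int refcnt;
};

/* Models a lookup that returns the object with one extra reference held. */
static struct cb *cb_lookup(struct cb *registered)
{
    if (registered)
        registered->refcnt++;
    return registered;
}

/* Drops one reference; the caller must actually hold one. */
static void cb_put(struct cb *cb)
{
    assert(cb->refcnt > 0);
    cb->refcnt--;
}

/* Event handler sketch: every successful lookup is paired with a put. */
static int handle_event(struct cb *registered)
{
    struct cb *cb = cb_lookup(registered);

    if (!cb)
        return 0;   /* nothing found, so nothing to release */

    /* ... use cb while the reference is held ... */

    cb_put(cb);     /* the release step that was missing */
    return 1;
}
```

With this pairing, the count after `handle_event()` returns equals the count before the call; without the `cb_put()` it would grow by one per event and the object could never be freed.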
    • H
      r8169: work around power-saving bug on some chip versions · e80bd76f
      Heiner Kallweit committed
      A user reported network failures with the RTL8168dp (a quite rare chip
      version). Realtek confirmed that a few chip versions suffer from a PLL
      power-down hardware bug.
      
      Fixes: 07df5bd8 ("r8169: power down chip in probe")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/a1c39460-d533-7f9e-fa9d-2b8990b02426@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e80bd76f
    • B
      net: usb: qmi_wwan: add Quectel EM160R-GL · cfd82dfc
      Bjørn Mork committed
      New modem using ff/ff/30 for QCDM, ff/00/00 for AT and NMEA,
      and ff/ff/ff for RMNET/QMI.
      
      T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=5000 MxCh= 0
      D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1
      P: Vendor=2c7c ProdID=0620 Rev= 4.09
      S: Manufacturer=Quectel
      S: Product=EM160R-GL
      S: SerialNumber=e31cedc1
      C:* #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA
      I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none)
      E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
      E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
      E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
      E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
      E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=32ms
      E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Link: https://lore.kernel.org/r/20201230152451.245271-1-bjorn@mork.no
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      cfd82dfc
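The class/subclass/protocol triples listed in the qmi_wwan commit above are what distinguish the modem's interfaces. As a rough illustration, assuming a simplified descriptor struct (`struct usb_if` and `is_em160r_qmi_if()` are hypothetical names, not the driver's match table), selecting the RMNET/QMI interface could look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one USB interface descriptor. */
struct usb_if {
    uint16_t vendor;
    uint16_t product;
    uint8_t cls, sub, proto;
};

/* True only for the EM160R-GL interface that carries RMNET/QMI (ff/ff/ff). */
static bool is_em160r_qmi_if(const struct usb_if *i)
{
    return i->vendor == 0x2c7c && i->product == 0x0620 &&
           i->cls == 0xff && i->sub == 0xff && i->proto == 0xff;
}

/* Values taken from the descriptor dump: If#0, If#1 and If#4. */
static void qmi_match_example(void)
{
    const struct usb_if qcdm = { 0x2c7c, 0x0620, 0xff, 0xff, 0x30 };
    const struct usb_if at   = { 0x2c7c, 0x0620, 0xff, 0x00, 0x00 };
    const struct usb_if qmi  = { 0x2c7c, 0x0620, 0xff, 0xff, 0xff };

    assert(!is_em160r_qmi_if(&qcdm));
    assert(!is_em160r_qmi_if(&at));
    assert(is_em160r_qmi_if(&qmi));
}
```

Only If#4 in the dump has the ff/ff/ff triple, so it is the only interface this predicate accepts.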
    • I
      selftests: mlxsw: Set headroom size of correct port · 2ff2c7e2
      Ido Schimmel committed
      The test was setting the headroom size of the wrong port. This was not
      visible because of a firmware bug that canceled this bug.
      
      Set the headroom size of the correct port, so that the test will pass
      with both old and new firmware versions.
      
      Fixes: bfa80478 ("selftests: mlxsw: Add a PFC test")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Link: https://lore.kernel.org/r/20201230114251.394009-1-idosch@idosch.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2ff2c7e2
    • C
      net: macb: Correct usage of MACB_CAPS_CLK_HW_CHG flag · 1d0d561a
      Charles Keepax committed
      A new flag MACB_CAPS_CLK_HW_CHG was added and all callers of
      macb_set_tx_clk were gated on the presence of this flag.
      
      -   if (!clk)
      +   if (!bp->tx_clk || !(bp->caps & MACB_CAPS_CLK_HW_CHG))
      
      However, the flag was not added to anything other than the new
      sama7g5_gem, turning that function call into a no-op for all other
      systems. This breaks networking on Zynq.
      
      The commit message adding this states: "a new capability so that
      macb_set_tx_clock() to not be called for IPs having this
      capability".
      
      This strongly implies that presence of the flag was intended to skip
      the function, not absence of the flag. Update the if statement to
      this effect, which repairs the existing users.
      
      Fixes: daafa1d3 ("net: macb: add capability to not set the clock rate")
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
      Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20210104103802.13091-1-ckeepax@opensource.cirrus.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1d0d561a
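The inverted test described in the macb commit above is easy to see side by side. The sketch below is a standalone model under stated assumptions: `struct dev`, the `CAPS_CLK_HW_CHG` bit value and both helper names are hypothetical, and the "after" condition is written from the commit's description (flag present means skip), not copied from the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define CAPS_CLK_HW_CHG 0x1u   /* hypothetical bit value for the capability */

/* Hypothetical reduction of the driver state to the two inputs of the test. */
struct dev {
    bool has_tx_clk;
    unsigned int caps;
};

/* Condition as introduced: the clock update is skipped unless the flag is set. */
static bool skip_set_clk_before(const struct dev *d)
{
    return !d->has_tx_clk || !(d->caps & CAPS_CLK_HW_CHG);
}

/* Intended condition: the clock update is skipped when the flag is set. */
static bool skip_set_clk_after(const struct dev *d)
{
    return !d->has_tx_clk || (d->caps & CAPS_CLK_HW_CHG) != 0;
}

static void macb_flag_example(void)
{
    const struct dev without_flag = { true, 0 };              /* e.g. Zynq */
    const struct dev with_flag = { true, CAPS_CLK_HW_CHG };   /* e.g. sama7g5_gem */

    assert(skip_set_clk_before(&without_flag));   /* the regression: wrongly skipped */
    assert(!skip_set_clk_after(&without_flag));   /* repaired: clock is set again */
    assert(skip_set_clk_after(&with_flag));       /* flagged IP still skips */
}
```

A device with a TX clock and no flag is the case that regressed: the first helper skips the clock update for it, the second does not.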
    • Y
      ibmvnic: fix: NULL pointer dereference. · 862aecbd
      YANG LI committed
      The error is due to dereferencing a NULL pointer in function
      reset_one_sub_crq_queue():
      
      if (!scrq) {
              netdev_dbg(adapter->netdev,
                         "Invalid scrq reset. irq (%d) or msgs(%p).\n",
                         scrq->irq, scrq->msgs);
              return -EINVAL;
      }
      
      If the expression is true, scrq must be a NULL pointer and must not
      be dereferenced.
      
      Fixes: 9281cf2d ("ibmvnic: avoid memset null scrq msgs")
      Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com>
      Reported-by: Abaci <abaci@linux.alibaba.com>
      Acked-by: Lijun Pan <ljp@linux.ibm.com>
      Link: https://lore.kernel.org/r/1609312994-121032-1-git-send-email-abaci-bugfix@linux.alibaba.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      862aecbd
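One way to repair the pattern quoted in the ibmvnic commit above is to keep the NULL check but log a message that does not touch the pointer. The following is a userspace sketch only: `struct sub_crq` and `reset_one()` are hypothetical stand-ins, and `fprintf` replaces the kernel's debug logging.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the sub-CRQ queue; only the logged fields are kept. */
struct sub_crq {
    int irq;
    void *msgs;
};

/* Rejects a NULL queue without reading any of its fields. */
static int reset_one(struct sub_crq *scrq)
{
    if (!scrq) {
        /* scrq is NULL here, so the message must not use scrq->irq or scrq->msgs. */
        fprintf(stderr, "Invalid scrq reset.\n");
        return -EINVAL;
    }

    /* ... the actual reset work would go here ... */
    return 0;
}

static void reset_example(void)
{
    struct sub_crq q = { .irq = 5, .msgs = NULL };

    assert(reset_one(NULL) == -EINVAL);
    assert(reset_one(&q) == 0);
}
```

The design point is that the error branch is the one place where the pointer is known to be invalid, so any diagnostic there has to be built from other data.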
    • B
      docs: networking: packet_mmap: fix old config reference · e4da63cd
      Baruch Siach committed
      Before commit 889b8f96 ("packet: Kill CONFIG_PACKET_MMAP.") there
      used to be a CONFIG_PACKET_MMAP config symbol that depended on
      CONFIG_PACKET. The text still implies that PACKET_MMAP can be disabled.
      Remove that from the text, as well as reference to old kernel versions.
      
      Also, drop the reference to a broken link with information for
      pre-2.6.5 kernels.
      
      Make a slight wording improvement (s/In/On/) while at it.
      
      Signed-off-by: Baruch Siach <baruch@tkos.co.il>
      Link: https://lore.kernel.org/r/80089f3783372c8fd7833f28ce774a171b2ef252.1609232919.git.baruch@tkos.co.il
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e4da63cd
    • B
      docs: networking: packet_mmap: fix formatting for C macros · 17e94567
      Baruch Siach committed
      The citation of macro definitions should appear in a code block.
      
      Signed-off-by: Baruch Siach <baruch@tkos.co.il>
      Link: https://lore.kernel.org/r/5cb47005e7a59b64299e038827e295822193384c.1609232919.git.baruch@tkos.co.il
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      17e94567
    • Y
      vhost_net: fix ubuf refcount incorrectly when sendmsg fails · 01e31bea
      Yunjian Wang committed
      Currently vhost_zerocopy_callback() may be called to decrease
      the refcount when sendmsg fails in tun. The error handling in vhost
      handle_tx_zerocopy() will then try to decrease the same refcount again.
      This is wrong. To fix this issue, only call vhost_net_ubuf_put()
      when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS.
      
      Fixes: bab632d6 ("vhost: vhost TX zero-copy support")
      Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Link: https://lore.kernel.org/r/1609207308-20544-1-git-send-email-wangyunjian@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      01e31bea
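The double-put hazard in the vhost_net commit above comes from two code paths that can both release the same reference. A small model of the guard, assuming a made-up `struct tx_slot` whose `len` field plays the role of vq->heads[nvq->desc].len and whose state names are illustrative only:

```c
#include <assert.h>

/* Illustrative states; the real driver encodes them in the descriptor length. */
enum dma_state { DMA_IN_PROGRESS, DMA_DONE };

/* Hypothetical model of one zerocopy TX slot and the reference it pins. */
struct tx_slot {
    enum dma_state len;
    int refcount;
};

/* Completion callback model: marks the slot finished and drops the reference. */
static void completion_callback(struct tx_slot *s)
{
    assert(s->refcount > 0);
    s->len = DMA_DONE;
    s->refcount--;
}

/* Error path after a failed send: release only if the callback has not run. */
static void tx_error_path(struct tx_slot *s)
{
    if (s->len == DMA_IN_PROGRESS) {
        assert(s->refcount > 0);
        s->refcount--;
    }
}
```

Whether or not the callback ran before the error path, the reference is dropped exactly once; an unconditional decrement in `tx_error_path()` would drive the count negative in the first case.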
    • T
      bareudp: Fix use of incorrect min_headroom size · 10ad3e99
      Taehee Yoo committed
      bareudp6_xmit_skb() calculates min_headroom based on struct iphdr,
      which is not correct for IPv6 and could lead to a panic.
      struct ipv6hdr should be used instead.
      
      Test commands:
          ip netns add A
          ip netns add B
          ip link add veth0 netns A type veth peer name veth1 netns B
          ip netns exec A ip link set veth0 up
          ip netns exec A ip a a 2001:db8:0::1/64 dev veth0
          ip netns exec B ip link set veth1 up
          ip netns exec B ip a a 2001:db8:0::2/64 dev veth1
      
          for i in {10..1}
          do
                  let A=$i-1
                  ip netns exec A ip link add bareudp$i type bareudp dstport $i \
      		    ethertype 0x86dd
                  ip netns exec A ip link set bareudp$i up
                  ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev bareudp$i
                  ip netns exec A ip -6 r a 2001:db8:$i::2 encap ip6 src \
      		    2001:db8:$A::1 dst 2001:db8:$A::2 via 2001:db8:$i::2 \
      		    dev bareudp$i
      
                  ip netns exec B ip link add bareudp$i type bareudp dstport $i \
      		    ethertype 0x86dd
                  ip netns exec B ip link set bareudp$i up
                  ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev bareudp$i
                  ip netns exec B ip -6 r a 2001:db8:$i::1 encap ip6 src \
      		    2001:db8:$A::2 dst 2001:db8:$A::1 via 2001:db8:$i::1 \
      		    dev bareudp$i
          done
          ip netns exec A ping 2001:db8:7::2
      
      Splat looks like:
      [   66.436679][    C2] skbuff: skb_under_panic: text:ffffffff928614c8 len:454 put:14 head:ffff88810abb4000 data:ffff88810abb3ffa tail:0x1c0 end:0x3ec0 dev:veth0
      [   66.441626][    C2] ------------[ cut here ]------------
      [   66.443458][    C2] kernel BUG at net/core/skbuff.c:109!
      [   66.445313][    C2] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [   66.447606][    C2] CPU: 2 PID: 913 Comm: ping Not tainted 5.10.0+ #819
      [   66.450251][    C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [   66.453713][    C2] RIP: 0010:skb_panic+0x15d/0x15f
      [   66.455345][    C2] Code: 98 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 60 8b 78 93 41 57 41 56 41 55 48 8b 54 24 20 48 8b 74 24 28 e8 b5 40 f9 ff <0f> 0b 48 8b 6c 24 20 89 34 24 e8 08 c9 98 fe 8b 34 24 48 c7 c1 80
      [   66.462314][    C2] RSP: 0018:ffff888119209648 EFLAGS: 00010286
      [   66.464281][    C2] RAX: 0000000000000089 RBX: ffff888003159000 RCX: 0000000000000000
      [   66.467216][    C2] RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed10232412c0
      [   66.469768][    C2] RBP: ffff88810a53d440 R08: ffffed102328018d R09: ffffed102328018d
      [   66.472297][    C2] R10: ffff888119400c67 R11: ffffed102328018c R12: 000000000000000e
      [   66.474833][    C2] R13: ffff88810abb3ffa R14: 00000000000001c0 R15: 0000000000003ec0
      [   66.477361][    C2] FS:  00007f37c0c72f00(0000) GS:ffff888119200000(0000) knlGS:0000000000000000
      [   66.480214][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   66.482296][    C2] CR2: 000055a058808570 CR3: 000000011039e002 CR4: 00000000003706e0
      [   66.484811][    C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   66.487793][    C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   66.490424][    C2] Call Trace:
      [   66.491469][    C2]  <IRQ>
      [   66.492374][    C2]  ? eth_header+0x28/0x190
      [   66.494054][    C2]  ? eth_header+0x28/0x190
      [   66.495401][    C2]  skb_push.cold.99+0x22/0x22
      [   66.496700][    C2]  eth_header+0x28/0x190
      [   66.497867][    C2]  neigh_resolve_output+0x3de/0x720
      [   66.499615][    C2]  ? __neigh_update+0x7e8/0x20a0
      [   66.501176][    C2]  __neigh_update+0x8bd/0x20a0
      [   66.502749][    C2]  ndisc_update+0x34/0xc0
      [   66.504010][    C2]  ndisc_recv_na+0x8da/0xb80
      [   66.505041][    C2]  ? pndisc_redo+0x20/0x20
      [   66.505888][    C2]  ? rcu_read_lock_sched_held+0xc0/0xc0
      [   66.506965][    C2]  ndisc_rcv+0x3a0/0x470
      [   66.507797][    C2]  icmpv6_rcv+0xad9/0x1b00
      [   66.508645][    C2]  ip6_protocol_deliver_rcu+0xcd6/0x1560
      [   66.509719][    C2]  ip6_input_finish+0x5b/0xf0
      [   66.510615][    C2]  ip6_input+0xcd/0x2d0
      [   66.511406][    C2]  ? ip6_input_finish+0xf0/0xf0
      [   66.512327][    C2]  ? rcu_read_lock_held+0x91/0xa0
      [   66.513279][    C2]  ? ip6_protocol_deliver_rcu+0x1560/0x1560
      [   66.514414][    C2]  ipv6_rcv+0xe8/0x300
      [ ... ]
      
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Fixes: 571912c6 ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20201228152146.24270-1-ap420073@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      10ad3e99
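The effect of sizing headroom with the wrong header type, as in the bareudp min_headroom commit above, can be put in numbers. This is a simplified calculation, not the driver's formula: `min_headroom()` and its parameters are hypothetical, and the constants are the standard sizes of an option-less IPv4 header (20 bytes), the fixed IPv6 header (40 bytes) and a UDP header (8 bytes).

```c
#include <assert.h>
#include <stddef.h>

enum {
    IPV4_HDR_LEN = 20,   /* IPv4 header without options */
    IPV6_HDR_LEN = 40,   /* fixed IPv6 header */
    UDP_HDR_LEN  = 8,
};

/* Simplified headroom estimate: link-layer reserve + UDP + options + IP header. */
static size_t min_headroom(size_t link_reserved, size_t options_len,
                           size_t ip_hdr_len)
{
    return link_reserved + UDP_HDR_LEN + options_len + ip_hdr_len;
}

/* Bytes by which an IPv4-sized estimate falls short on the IPv6 path. */
static size_t headroom_shortfall(size_t link_reserved, size_t options_len)
{
    return min_headroom(link_reserved, options_len, IPV6_HDR_LEN) -
           min_headroom(link_reserved, options_len, IPV4_HDR_LEN);
}

static void headroom_example(void)
{
    /* 16 is an arbitrary link-layer reserve chosen for illustration. */
    assert(min_headroom(16, 0, IPV6_HDR_LEN) == 64);
    assert(headroom_shortfall(16, 0) == 20);
}
```

The shortfall is independent of the other terms: an estimate built on the IPv4 header size always reserves 20 bytes too few for an IPv6 encapsulation.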
    • T
      bareudp: set NETIF_F_LLTX flag · d9e44981
      Taehee Yoo committed
      Like other tunneling interfaces, bareudp doesn't need TXLOCK.
      So it is good to set the NETIF_F_LLTX flag to improve performance and
      to avoid lockdep's false-positive warning.
      
      Test commands:
          ip netns add A
          ip netns add B
          ip link add veth0 netns A type veth peer name veth1 netns B
          ip netns exec A ip link set veth0 up
          ip netns exec A ip a a 10.0.0.1/24 dev veth0
          ip netns exec B ip link set veth1 up
          ip netns exec B ip a a 10.0.0.2/24 dev veth1
      
          for i in {2..1}
          do
                  let A=$i-1
                  ip netns exec A ip link add bareudp$i type bareudp \
      		    dstport $i ethertype ip
                  ip netns exec A ip link set bareudp$i up
                  ip netns exec A ip a a 10.0.$i.1/24 dev bareudp$i
                  ip netns exec A ip r a 10.0.$i.2 encap ip src 10.0.$A.1 \
      		    dst 10.0.$A.2 via 10.0.$i.2 dev bareudp$i
      
                  ip netns exec B ip link add bareudp$i type bareudp \
      		    dstport $i ethertype ip
                  ip netns exec B ip link set bareudp$i up
                  ip netns exec B ip a a 10.0.$i.2/24 dev bareudp$i
                  ip netns exec B ip r a 10.0.$i.1 encap ip src 10.0.$A.2 \
      		    dst 10.0.$A.1 via 10.0.$i.1 dev bareudp$i
          done
          ip netns exec A ping 10.0.2.2
      
      Splat looks like:
      [   96.992803][  T822] ============================================
      [   96.993954][  T822] WARNING: possible recursive locking detected
      [   96.995102][  T822] 5.10.0+ #819 Not tainted
      [   96.995927][  T822] --------------------------------------------
      [   96.997091][  T822] ping/822 is trying to acquire lock:
      [   96.998083][  T822] ffff88810f753898 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
      [   96.999813][  T822]
      [   96.999813][  T822] but task is already holding lock:
      [   97.001192][  T822] ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
      [   97.002908][  T822]
      [   97.002908][  T822] other info that might help us debug this:
      [   97.004401][  T822]  Possible unsafe locking scenario:
      [   97.004401][  T822]
      [   97.005784][  T822]        CPU0
      [   97.006407][  T822]        ----
      [   97.007010][  T822]   lock(_xmit_NONE#2);
      [   97.007779][  T822]   lock(_xmit_NONE#2);
      [   97.008550][  T822]
      [   97.008550][  T822]  *** DEADLOCK ***
      [   97.008550][  T822]
      [   97.010057][  T822]  May be due to missing lock nesting notation
      [   97.010057][  T822]
      [   97.011594][  T822] 7 locks held by ping/822:
      [   97.012426][  T822]  #0: ffff888109a144f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x12f7/0x2b00
      [   97.014191][  T822]  #1: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
      [   97.016045][  T822]  #2: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
      [   97.017897][  T822]  #3: ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
      [   97.019684][  T822]  #4: ffffffffbce2f600 (rcu_read_lock){....}-{1:2}, at: bareudp_xmit+0x31b/0x3690 [bareudp]
      [   97.021573][  T822]  #5: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
      [   97.023424][  T822]  #6: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
      [   97.025259][  T822]
      [   97.025259][  T822] stack backtrace:
      [   97.026349][  T822] CPU: 3 PID: 822 Comm: ping Not tainted 5.10.0+ #819
      [   97.027609][  T822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [   97.029407][  T822] Call Trace:
      [   97.030015][  T822]  dump_stack+0x99/0xcb
      [   97.030783][  T822]  __lock_acquire.cold.77+0x149/0x3a9
      [   97.031773][  T822]  ? stack_trace_save+0x81/0xa0
      [   97.032661][  T822]  ? register_lock_class+0x1910/0x1910
      [   97.033673][  T822]  ? register_lock_class+0x1910/0x1910
      [   97.034679][  T822]  ? rcu_read_lock_sched_held+0x91/0xc0
      [   97.035697][  T822]  ? rcu_read_lock_bh_held+0xa0/0xa0
      [   97.036690][  T822]  lock_acquire+0x1b2/0x730
      [   97.037515][  T822]  ? __dev_queue_xmit+0x1f52/0x2960
      [   97.038466][  T822]  ? check_flags+0x50/0x50
      [   97.039277][  T822]  ? netif_skb_features+0x296/0x9c0
      [   97.040226][  T822]  ? validate_xmit_skb+0x29/0xb10
      [   97.041151][  T822]  _raw_spin_lock+0x30/0x70
      [   97.041977][  T822]  ? __dev_queue_xmit+0x1f52/0x2960
      [   97.042927][  T822]  __dev_queue_xmit+0x1f52/0x2960
      [   97.043852][  T822]  ? netdev_core_pick_tx+0x290/0x290
      [   97.044824][  T822]  ? mark_held_locks+0xb7/0x120
      [   97.045712][  T822]  ? lockdep_hardirqs_on_prepare+0x12c/0x3e0
      [   97.046824][  T822]  ? __local_bh_enable_ip+0xa5/0xf0
      [   97.047771][  T822]  ? ___neigh_create+0x12a8/0x1eb0
      [   97.048710][  T822]  ? trace_hardirqs_on+0x41/0x120
      [   97.049626][  T822]  ? ___neigh_create+0x12a8/0x1eb0
      [   97.050556][  T822]  ? __local_bh_enable_ip+0xa5/0xf0
      [   97.051509][  T822]  ? ___neigh_create+0x12a8/0x1eb0
      [   97.052443][  T822]  ? check_chain_key+0x244/0x5f0
      [   97.053352][  T822]  ? rcu_read_lock_bh_held+0x56/0xa0
      [   97.054317][  T822]  ? ip_finish_output2+0x6ea/0x2020
      [   97.055263][  T822]  ? pneigh_lookup+0x410/0x410
      [   97.056135][  T822]  ip_finish_output2+0x6ea/0x2020
      [ ... ]
      
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Fixes: 571912c6 ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20201228152136.24215-1-ap420073@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d9e44981