1. 05 5月, 2020 2 次提交
    • C
      net: partially revert dynamic lockdep key changes · 1a33e10e
      Cong Wang 提交于
      This patch reverts the folowing commits:
      
      commit 064ff66e
      "bonding: add missing netdev_update_lockdep_key()"
      
      commit 53d37497
      "net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key()"
      
      commit 1f26c0d3
      "net: fix kernel-doc warning in <linux/netdevice.h>"
      
      commit ab92d68f
      "net: core: add generic lockdep keys"
      
      but keeps the addr_list_lock_key because we still lock
      addr_list_lock nestedly on stack devices, unlikely xmit_lock
      this is safe because we don't take addr_list_lock on any fast
      path.
      
      Reported-and-tested-by: syzbot+aaa6fa4949cc5d9b7b25@syzkaller.appspotmail.com
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1a33e10e
    • J
      net: sched: fallback to qdisc noqueue if default qdisc setup fail · bf6dba76
      Jesper Dangaard Brouer 提交于
      Currently if the default qdisc setup/init fails, the device ends up with
      qdisc "noop", which causes all TX packets to get dropped.
      
      With the introduction of sysctl net/core/default_qdisc it is possible
      to change the default qdisc to be more advanced, which opens for the
      possibility that Qdisc_ops->init() can fail.
      
      This patch detect these kind of failures, and choose to fallback to
      qdisc "noqueue", which is so simple that its init call will not fail.
      This allows the interface to continue functioning.
      
      V2:
      As this also captures memory failures, which are transient, the
      device is not kept in IFF_NO_QUEUE state.  This allows the net_device
      to retry to default qdisc assignment.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bf6dba76
  2. 13 3月, 2020 1 次提交
    • J
      Revert "net: sched: make newly activated qdiscs visible" · 7c4046b1
      Julian Wiedmann 提交于
      This reverts commit 4cda7527
      from net-next.
      
      Brown bag time.
      
      Michal noticed that this change doesn't work at all when
      netif_set_real_num_tx_queues() gets called prior to an initial
      dev_activate(), as for instance igb does.
      
      Doing so dies with:
      
      [   40.579142] BUG: kernel NULL pointer dereference, address: 0000000000000400
      [   40.586922] #PF: supervisor read access in kernel mode
      [   40.592668] #PF: error_code(0x0000) - not-present page
      [   40.598405] PGD 0 P4D 0
      [   40.601234] Oops: 0000 [#1] PREEMPT SMP PTI
      [   40.605909] CPU: 18 PID: 1681 Comm: wickedd Tainted: G            E     5.6.0-rc3-ethnl.50-default #1
      [   40.616205] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013
      [   40.627377] RIP: 0010:qdisc_hash_add.part.22+0x2e/0x90
      [   40.633115] Code: 00 55 53 89 f5 48 89 fb e8 2f 9b fb ff 85 c0 74 44 48 8b 43 40 48 8b 08 69 43 38 47 86 c8 61 c1 e8 1c 48 83 e8 80 48 8d 14 c1 <48> 8b 04 c1 48 8d 4b 28 48 89 53 30 48 89 43 28 48 85 c0 48 89 0a
      [   40.654080] RSP: 0018:ffffb879864934d8 EFLAGS: 00010203
      [   40.659914] RAX: 0000000000000080 RBX: ffffffffb8328d80 RCX: 0000000000000000
      [   40.667882] RDX: 0000000000000400 RSI: 0000000000000000 RDI: ffffffffb831faa0
      [   40.675849] RBP: 0000000000000000 R08: ffffa0752c8b9088 R09: ffffa0752c8b9208
      [   40.683816] R10: 0000000000000006 R11: 0000000000000000 R12: ffffa0752d734000
      [   40.691783] R13: 0000000000000008 R14: 0000000000000000 R15: ffffa07113c18000
      [   40.699750] FS:  00007f94548e5880(0000) GS:ffffa0752e980000(0000) knlGS:0000000000000000
      [   40.708782] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   40.715189] CR2: 0000000000000400 CR3: 000000082b6ae006 CR4: 00000000001606e0
      [   40.723156] Call Trace:
      [   40.725888]  dev_qdisc_set_real_num_tx_queues+0x61/0x90
      [   40.731725]  netif_set_real_num_tx_queues+0x94/0x1d0
      [   40.737286]  __igb_open+0x19a/0x5d0 [igb]
      [   40.741767]  __dev_open+0xbb/0x150
      [   40.745567]  __dev_change_flags+0x157/0x1a0
      [   40.750240]  dev_change_flags+0x23/0x60
      
      [...]
      
      Fixes: 4cda7527 ("net: sched: make newly activated qdiscs visible")
      Reported-by: NMichal Kubecek <mkubecek@suse.cz>
      CC: Michal Kubecek <mkubecek@suse.cz>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jamal Hadi Salim <jhs@mojatatu.com>
      CC: Cong Wang <xiyou.wangcong@gmail.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c4046b1
  3. 12 3月, 2020 1 次提交
    • J
      net: sched: make newly activated qdiscs visible · 4cda7527
      Julian Wiedmann 提交于
      In their .attach callback, mq[prio] only add the qdiscs of the currently
      active TX queues to the device's qdisc hash list.
      If a user later increases the number of active TX queues, their qdiscs
      are not visible via eg. 'tc qdisc show'.
      
      Add a hook to netif_set_real_num_tx_queues() that walks all active
      TX queues and adds those which are missing to the hash list.
      
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jamal Hadi Salim <jhs@mojatatu.com>
      CC: Cong Wang <xiyou.wangcong@gmail.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4cda7527
  4. 20 2月, 2020 1 次提交
  5. 13 12月, 2019 1 次提交
    • M
      netdev: pass the stuck queue to the timeout handler · 0290bd29
      Michael S. Tsirkin 提交于
      This allows incrementing the correct timeout statistic without any mess.
      Down the road, devices can learn to reset just the specific queue.
      
      The patch was generated with the following script:
      
      use strict;
      use warnings;
      
      our $^I = '.bak';
      
      my @work = (
      ["arch/m68k/emu/nfeth.c", "nfeth_tx_timeout"],
      ["arch/um/drivers/net_kern.c", "uml_net_tx_timeout"],
      ["arch/um/drivers/vector_kern.c", "vector_net_tx_timeout"],
      ["arch/xtensa/platforms/iss/network.c", "iss_net_tx_timeout"],
      ["drivers/char/pcmcia/synclink_cs.c", "hdlcdev_tx_timeout"],
      ["drivers/infiniband/ulp/ipoib/ipoib_main.c", "ipoib_timeout"],
      ["drivers/infiniband/ulp/ipoib/ipoib_main.c", "ipoib_timeout"],
      ["drivers/message/fusion/mptlan.c", "mpt_lan_tx_timeout"],
      ["drivers/misc/sgi-xp/xpnet.c", "xpnet_dev_tx_timeout"],
      ["drivers/net/appletalk/cops.c", "cops_timeout"],
      ["drivers/net/arcnet/arcdevice.h", "arcnet_timeout"],
      ["drivers/net/arcnet/arcnet.c", "arcnet_timeout"],
      ["drivers/net/arcnet/com20020.c", "arcnet_timeout"],
      ["drivers/net/ethernet/3com/3c509.c", "el3_tx_timeout"],
      ["drivers/net/ethernet/3com/3c515.c", "corkscrew_timeout"],
      ["drivers/net/ethernet/3com/3c574_cs.c", "el3_tx_timeout"],
      ["drivers/net/ethernet/3com/3c589_cs.c", "el3_tx_timeout"],
      ["drivers/net/ethernet/3com/3c59x.c", "vortex_tx_timeout"],
      ["drivers/net/ethernet/3com/3c59x.c", "vortex_tx_timeout"],
      ["drivers/net/ethernet/3com/typhoon.c", "typhoon_tx_timeout"],
      ["drivers/net/ethernet/8390/8390.h", "ei_tx_timeout"],
      ["drivers/net/ethernet/8390/8390.h", "eip_tx_timeout"],
      ["drivers/net/ethernet/8390/8390.c", "ei_tx_timeout"],
      ["drivers/net/ethernet/8390/8390p.c", "eip_tx_timeout"],
      ["drivers/net/ethernet/8390/ax88796.c", "ax_ei_tx_timeout"],
      ["drivers/net/ethernet/8390/axnet_cs.c", "axnet_tx_timeout"],
      ["drivers/net/ethernet/8390/etherh.c", "__ei_tx_timeout"],
      ["drivers/net/ethernet/8390/hydra.c", "__ei_tx_timeout"],
      ["drivers/net/ethernet/8390/mac8390.c", "__ei_tx_timeout"],
      ["drivers/net/ethernet/8390/mcf8390.c", "__ei_tx_timeout"],
      ["drivers/net/ethernet/8390/lib8390.c", "__ei_tx_timeout"],
      ["drivers/net/ethernet/8390/ne2k-pci.c", "ei_tx_timeout"],
      ["drivers/net/ethernet/8390/pcnet_cs.c", "ei_tx_timeout"],
      ["drivers/net/ethernet/8390/smc-ultra.c", "ei_tx_timeout"],
      ["drivers/net/ethernet/8390/wd.c", "ei_tx_timeout"],
      ["drivers/net/ethernet/8390/zorro8390.c", "__ei_tx_timeout"],
      ["drivers/net/ethernet/adaptec/starfire.c", "tx_timeout"],
      ["drivers/net/ethernet/agere/et131x.c", "et131x_tx_timeout"],
      ["drivers/net/ethernet/allwinner/sun4i-emac.c", "emac_timeout"],
      ["drivers/net/ethernet/alteon/acenic.c", "ace_watchdog"],
      ["drivers/net/ethernet/amazon/ena/ena_netdev.c", "ena_tx_timeout"],
      ["drivers/net/ethernet/amd/7990.h", "lance_tx_timeout"],
      ["drivers/net/ethernet/amd/7990.c", "lance_tx_timeout"],
      ["drivers/net/ethernet/amd/a2065.c", "lance_tx_timeout"],
      ["drivers/net/ethernet/amd/am79c961a.c", "am79c961_timeout"],
      ["drivers/net/ethernet/amd/amd8111e.c", "amd8111e_tx_timeout"],
      ["drivers/net/ethernet/amd/ariadne.c", "ariadne_tx_timeout"],
      ["drivers/net/ethernet/amd/atarilance.c", "lance_tx_timeout"],
      ["drivers/net/ethernet/amd/au1000_eth.c", "au1000_tx_timeout"],
      ["drivers/net/ethernet/amd/declance.c", "lance_tx_timeout"],
      ["drivers/net/ethernet/amd/lance.c", "lance_tx_timeout"],
      ["drivers/net/ethernet/amd/mvme147.c", "lance_tx_timeout"],
      ["drivers/net/ethernet/amd/ni65.c", "ni65_timeout"],
      ["drivers/net/ethernet/amd/nmclan_cs.c", "mace_tx_timeout"],
      ["drivers/net/ethernet/amd/pcnet32.c", "pcnet32_tx_timeout"],
      ["drivers/net/ethernet/amd/sunlance.c", "lance_tx_timeout"],
      ["drivers/net/ethernet/amd/xgbe/xgbe-drv.c", "xgbe_tx_timeout"],
      ["drivers/net/ethernet/apm/xgene-v2/main.c", "xge_timeout"],
      ["drivers/net/ethernet/apm/xgene/xgene_enet_main.c", "xgene_enet_timeout"],
      ["drivers/net/ethernet/apple/macmace.c", "mace_tx_timeout"],
      ["drivers/net/ethernet/atheros/ag71xx.c", "ag71xx_tx_timeout"],
      ["drivers/net/ethernet/atheros/alx/main.c", "alx_tx_timeout"],
      ["drivers/net/ethernet/atheros/atl1c/atl1c_main.c", "atl1c_tx_timeout"],
      ["drivers/net/ethernet/atheros/atl1e/atl1e_main.c", "atl1e_tx_timeout"],
      ["drivers/net/ethernet/atheros/atlx/atl.c", "atlx_tx_timeout"],
      ["drivers/net/ethernet/atheros/atlx/atl1.c", "atlx_tx_timeout"],
      ["drivers/net/ethernet/atheros/atlx/atl2.c", "atl2_tx_timeout"],
      ["drivers/net/ethernet/broadcom/b44.c", "b44_tx_timeout"],
      ["drivers/net/ethernet/broadcom/bcmsysport.c", "bcm_sysport_tx_timeout"],
      ["drivers/net/ethernet/broadcom/bnx2.c", "bnx2_tx_timeout"],
      ["drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h", "bnx2x_tx_timeout"],
      ["drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c", "bnx2x_tx_timeout"],
      ["drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c", "bnx2x_tx_timeout"],
      ["drivers/net/ethernet/broadcom/bnxt/bnxt.c", "bnxt_tx_timeout"],
      ["drivers/net/ethernet/broadcom/genet/bcmgenet.c", "bcmgenet_timeout"],
      ["drivers/net/ethernet/broadcom/sb1250-mac.c", "sbmac_tx_timeout"],
      ["drivers/net/ethernet/broadcom/tg3.c", "tg3_tx_timeout"],
      ["drivers/net/ethernet/calxeda/xgmac.c", "xgmac_tx_timeout"],
      ["drivers/net/ethernet/cavium/liquidio/lio_main.c", "liquidio_tx_timeout"],
      ["drivers/net/ethernet/cavium/liquidio/lio_vf_main.c", "liquidio_tx_timeout"],
      ["drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c", "lio_vf_rep_tx_timeout"],
      ["drivers/net/ethernet/cavium/thunder/nicvf_main.c", "nicvf_tx_timeout"],
      ["drivers/net/ethernet/cirrus/cs89x0.c", "net_timeout"],
      ["drivers/net/ethernet/cisco/enic/enic_main.c", "enic_tx_timeout"],
      ["drivers/net/ethernet/cisco/enic/enic_main.c", "enic_tx_timeout"],
      ["drivers/net/ethernet/cortina/gemini.c", "gmac_tx_timeout"],
      ["drivers/net/ethernet/davicom/dm9000.c", "dm9000_timeout"],
      ["drivers/net/ethernet/dec/tulip/de2104x.c", "de_tx_timeout"],
      ["drivers/net/ethernet/dec/tulip/tulip_core.c", "tulip_tx_timeout"],
      ["drivers/net/ethernet/dec/tulip/winbond-840.c", "tx_timeout"],
      ["drivers/net/ethernet/dlink/dl2k.c", "rio_tx_timeout"],
      ["drivers/net/ethernet/dlink/sundance.c", "tx_timeout"],
      ["drivers/net/ethernet/emulex/benet/be_main.c", "be_tx_timeout"],
      ["drivers/net/ethernet/ethoc.c", "ethoc_tx_timeout"],
      ["drivers/net/ethernet/faraday/ftgmac100.c", "ftgmac100_tx_timeout"],
      ["drivers/net/ethernet/fealnx.c", "fealnx_tx_timeout"],
      ["drivers/net/ethernet/freescale/dpaa/dpaa_eth.c", "dpaa_tx_timeout"],
      ["drivers/net/ethernet/freescale/fec_main.c", "fec_timeout"],
      ["drivers/net/ethernet/freescale/fec_mpc52xx.c", "mpc52xx_fec_tx_timeout"],
      ["drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c", "fs_timeout"],
      ["drivers/net/ethernet/freescale/gianfar.c", "gfar_timeout"],
      ["drivers/net/ethernet/freescale/ucc_geth.c", "ucc_geth_timeout"],
      ["drivers/net/ethernet/fujitsu/fmvj18x_cs.c", "fjn_tx_timeout"],
      ["drivers/net/ethernet/google/gve/gve_main.c", "gve_tx_timeout"],
      ["drivers/net/ethernet/hisilicon/hip04_eth.c", "hip04_timeout"],
      ["drivers/net/ethernet/hisilicon/hix5hd2_gmac.c", "hix5hd2_net_timeout"],
      ["drivers/net/ethernet/hisilicon/hns/hns_enet.c", "hns_nic_net_timeout"],
      ["drivers/net/ethernet/hisilicon/hns3/hns3_enet.c", "hns3_nic_net_timeout"],
      ["drivers/net/ethernet/huawei/hinic/hinic_main.c", "hinic_tx_timeout"],
      ["drivers/net/ethernet/i825xx/82596.c", "i596_tx_timeout"],
      ["drivers/net/ethernet/i825xx/ether1.c", "ether1_timeout"],
      ["drivers/net/ethernet/i825xx/lib82596.c", "i596_tx_timeout"],
      ["drivers/net/ethernet/i825xx/sun3_82586.c", "sun3_82586_timeout"],
      ["drivers/net/ethernet/ibm/ehea/ehea_main.c", "ehea_tx_watchdog"],
      ["drivers/net/ethernet/ibm/emac/core.c", "emac_tx_timeout"],
      ["drivers/net/ethernet/ibm/emac/core.c", "emac_tx_timeout"],
      ["drivers/net/ethernet/ibm/ibmvnic.c", "ibmvnic_tx_timeout"],
      ["drivers/net/ethernet/intel/e100.c", "e100_tx_timeout"],
      ["drivers/net/ethernet/intel/e1000/e1000_main.c", "e1000_tx_timeout"],
      ["drivers/net/ethernet/intel/e1000e/netdev.c", "e1000_tx_timeout"],
      ["drivers/net/ethernet/intel/fm10k/fm10k_netdev.c", "fm10k_tx_timeout"],
      ["drivers/net/ethernet/intel/i40e/i40e_main.c", "i40e_tx_timeout"],
      ["drivers/net/ethernet/intel/iavf/iavf_main.c", "iavf_tx_timeout"],
      ["drivers/net/ethernet/intel/ice/ice_main.c", "ice_tx_timeout"],
      ["drivers/net/ethernet/intel/ice/ice_main.c", "ice_tx_timeout"],
      ["drivers/net/ethernet/intel/igb/igb_main.c", "igb_tx_timeout"],
      ["drivers/net/ethernet/intel/igbvf/netdev.c", "igbvf_tx_timeout"],
      ["drivers/net/ethernet/intel/ixgb/ixgb_main.c", "ixgb_tx_timeout"],
      ["drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c", "adapter->netdev->netdev_ops->ndo_tx_timeout(adapter->netdev);"],
      ["drivers/net/ethernet/intel/ixgbe/ixgbe_main.c", "ixgbe_tx_timeout"],
      ["drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c", "ixgbevf_tx_timeout"],
      ["drivers/net/ethernet/jme.c", "jme_tx_timeout"],
      ["drivers/net/ethernet/korina.c", "korina_tx_timeout"],
      ["drivers/net/ethernet/lantiq_etop.c", "ltq_etop_tx_timeout"],
      ["drivers/net/ethernet/marvell/mv643xx_eth.c", "mv643xx_eth_tx_timeout"],
      ["drivers/net/ethernet/marvell/pxa168_eth.c", "pxa168_eth_tx_timeout"],
      ["drivers/net/ethernet/marvell/skge.c", "skge_tx_timeout"],
      ["drivers/net/ethernet/marvell/sky2.c", "sky2_tx_timeout"],
      ["drivers/net/ethernet/marvell/sky2.c", "sky2_tx_timeout"],
      ["drivers/net/ethernet/mediatek/mtk_eth_soc.c", "mtk_tx_timeout"],
      ["drivers/net/ethernet/mellanox/mlx4/en_netdev.c", "mlx4_en_tx_timeout"],
      ["drivers/net/ethernet/mellanox/mlx4/en_netdev.c", "mlx4_en_tx_timeout"],
      ["drivers/net/ethernet/mellanox/mlx5/core/en_main.c", "mlx5e_tx_timeout"],
      ["drivers/net/ethernet/micrel/ks8842.c", "ks8842_tx_timeout"],
      ["drivers/net/ethernet/micrel/ksz884x.c", "netdev_tx_timeout"],
      ["drivers/net/ethernet/microchip/enc28j60.c", "enc28j60_tx_timeout"],
      ["drivers/net/ethernet/microchip/encx24j600.c", "encx24j600_tx_timeout"],
      ["drivers/net/ethernet/natsemi/sonic.h", "sonic_tx_timeout"],
      ["drivers/net/ethernet/natsemi/sonic.c", "sonic_tx_timeout"],
      ["drivers/net/ethernet/natsemi/jazzsonic.c", "sonic_tx_timeout"],
      ["drivers/net/ethernet/natsemi/macsonic.c", "sonic_tx_timeout"],
      ["drivers/net/ethernet/natsemi/natsemi.c", "ns_tx_timeout"],
      ["drivers/net/ethernet/natsemi/ns83820.c", "ns83820_tx_timeout"],
      ["drivers/net/ethernet/natsemi/xtsonic.c", "sonic_tx_timeout"],
      ["drivers/net/ethernet/neterion/s2io.h", "s2io_tx_watchdog"],
      ["drivers/net/ethernet/neterion/s2io.c", "s2io_tx_watchdog"],
      ["drivers/net/ethernet/neterion/vxge/vxge-main.c", "vxge_tx_watchdog"],
      ["drivers/net/ethernet/netronome/nfp/nfp_net_common.c", "nfp_net_tx_timeout"],
      ["drivers/net/ethernet/nvidia/forcedeth.c", "nv_tx_timeout"],
      ["drivers/net/ethernet/nvidia/forcedeth.c", "nv_tx_timeout"],
      ["drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c", "pch_gbe_tx_timeout"],
      ["drivers/net/ethernet/packetengines/hamachi.c", "hamachi_tx_timeout"],
      ["drivers/net/ethernet/packetengines/yellowfin.c", "yellowfin_tx_timeout"],
      ["drivers/net/ethernet/pensando/ionic/ionic_lif.c", "ionic_tx_timeout"],
      ["drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c", "netxen_tx_timeout"],
      ["drivers/net/ethernet/qlogic/qla3xxx.c", "ql3xxx_tx_timeout"],
      ["drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c", "qlcnic_tx_timeout"],
      ["drivers/net/ethernet/qualcomm/emac/emac.c", "emac_tx_timeout"],
      ["drivers/net/ethernet/qualcomm/qca_spi.c", "qcaspi_netdev_tx_timeout"],
      ["drivers/net/ethernet/qualcomm/qca_uart.c", "qcauart_netdev_tx_timeout"],
      ["drivers/net/ethernet/rdc/r6040.c", "r6040_tx_timeout"],
      ["drivers/net/ethernet/realtek/8139cp.c", "cp_tx_timeout"],
      ["drivers/net/ethernet/realtek/8139too.c", "rtl8139_tx_timeout"],
      ["drivers/net/ethernet/realtek/atp.c", "tx_timeout"],
      ["drivers/net/ethernet/realtek/r8169_main.c", "rtl8169_tx_timeout"],
      ["drivers/net/ethernet/renesas/ravb_main.c", "ravb_tx_timeout"],
      ["drivers/net/ethernet/renesas/sh_eth.c", "sh_eth_tx_timeout"],
      ["drivers/net/ethernet/renesas/sh_eth.c", "sh_eth_tx_timeout"],
      ["drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c", "sxgbe_tx_timeout"],
      ["drivers/net/ethernet/seeq/ether3.c", "ether3_timeout"],
      ["drivers/net/ethernet/seeq/sgiseeq.c", "timeout"],
      ["drivers/net/ethernet/sfc/efx.c", "efx_watchdog"],
      ["drivers/net/ethernet/sfc/falcon/efx.c", "ef4_watchdog"],
      ["drivers/net/ethernet/sgi/ioc3-eth.c", "ioc3_timeout"],
      ["drivers/net/ethernet/sgi/meth.c", "meth_tx_timeout"],
      ["drivers/net/ethernet/silan/sc92031.c", "sc92031_tx_timeout"],
      ["drivers/net/ethernet/sis/sis190.c", "sis190_tx_timeout"],
      ["drivers/net/ethernet/sis/sis900.c", "sis900_tx_timeout"],
      ["drivers/net/ethernet/smsc/epic100.c", "epic_tx_timeout"],
      ["drivers/net/ethernet/smsc/smc911x.c", "smc911x_timeout"],
      ["drivers/net/ethernet/smsc/smc9194.c", "smc_timeout"],
      ["drivers/net/ethernet/smsc/smc91c92_cs.c", "smc_tx_timeout"],
      ["drivers/net/ethernet/smsc/smc91x.c", "smc_timeout"],
      ["drivers/net/ethernet/stmicro/stmmac/stmmac_main.c", "stmmac_tx_timeout"],
      ["drivers/net/ethernet/sun/cassini.c", "cas_tx_timeout"],
      ["drivers/net/ethernet/sun/ldmvsw.c", "sunvnet_tx_timeout_common"],
      ["drivers/net/ethernet/sun/niu.c", "niu_tx_timeout"],
      ["drivers/net/ethernet/sun/sunbmac.c", "bigmac_tx_timeout"],
      ["drivers/net/ethernet/sun/sungem.c", "gem_tx_timeout"],
      ["drivers/net/ethernet/sun/sunhme.c", "happy_meal_tx_timeout"],
      ["drivers/net/ethernet/sun/sunqe.c", "qe_tx_timeout"],
      ["drivers/net/ethernet/sun/sunvnet.c", "sunvnet_tx_timeout_common"],
      ["drivers/net/ethernet/sun/sunvnet_common.c", "sunvnet_tx_timeout_common"],
      ["drivers/net/ethernet/sun/sunvnet_common.h", "sunvnet_tx_timeout_common"],
      ["drivers/net/ethernet/synopsys/dwc-xlgmac-net.c", "xlgmac_tx_timeout"],
      ["drivers/net/ethernet/ti/cpmac.c", "cpmac_tx_timeout"],
      ["drivers/net/ethernet/ti/cpsw.c", "cpsw_ndo_tx_timeout"],
      ["drivers/net/ethernet/ti/cpsw_priv.c", "cpsw_ndo_tx_timeout"],
      ["drivers/net/ethernet/ti/cpsw_priv.h", "cpsw_ndo_tx_timeout"],
      ["drivers/net/ethernet/ti/davinci_emac.c", "emac_dev_tx_timeout"],
      ["drivers/net/ethernet/ti/netcp_core.c", "netcp_ndo_tx_timeout"],
      ["drivers/net/ethernet/ti/tlan.c", "tlan_tx_timeout"],
      ["drivers/net/ethernet/toshiba/ps3_gelic_net.h", "gelic_net_tx_timeout"],
      ["drivers/net/ethernet/toshiba/ps3_gelic_net.c", "gelic_net_tx_timeout"],
      ["drivers/net/ethernet/toshiba/ps3_gelic_wireless.c", "gelic_net_tx_timeout"],
      ["drivers/net/ethernet/toshiba/spider_net.c", "spider_net_tx_timeout"],
      ["drivers/net/ethernet/toshiba/tc35815.c", "tc35815_tx_timeout"],
      ["drivers/net/ethernet/via/via-rhine.c", "rhine_tx_timeout"],
      ["drivers/net/ethernet/wiznet/w5100.c", "w5100_tx_timeout"],
      ["drivers/net/ethernet/wiznet/w5300.c", "w5300_tx_timeout"],
      ["drivers/net/ethernet/xilinx/xilinx_emaclite.c", "xemaclite_tx_timeout"],
      ["drivers/net/ethernet/xircom/xirc2ps_cs.c", "xirc_tx_timeout"],
      ["drivers/net/fjes/fjes_main.c", "fjes_tx_retry"],
      ["drivers/net/slip/slip.c", "sl_tx_timeout"],
      ["include/linux/usb/usbnet.h", "usbnet_tx_timeout"],
      ["drivers/net/usb/aqc111.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/ax88172a.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/ax88179_178a.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/catc.c", "catc_tx_timeout"],
      ["drivers/net/usb/cdc_mbim.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/cdc_ncm.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/dm9601.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/hso.c", "hso_net_tx_timeout"],
      ["drivers/net/usb/int51x1.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/ipheth.c", "ipheth_tx_timeout"],
      ["drivers/net/usb/kaweth.c", "kaweth_tx_timeout"],
      ["drivers/net/usb/lan78xx.c", "lan78xx_tx_timeout"],
      ["drivers/net/usb/mcs7830.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/pegasus.c", "pegasus_tx_timeout"],
      ["drivers/net/usb/qmi_wwan.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/r8152.c", "rtl8152_tx_timeout"],
      ["drivers/net/usb/rndis_host.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/rtl8150.c", "rtl8150_tx_timeout"],
      ["drivers/net/usb/sierra_net.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/smsc75xx.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/smsc95xx.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/sr9700.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/sr9800.c", "usbnet_tx_timeout"],
      ["drivers/net/usb/usbnet.c", "usbnet_tx_timeout"],
      ["drivers/net/vmxnet3/vmxnet3_drv.c", "vmxnet3_tx_timeout"],
      ["drivers/net/wan/cosa.c", "cosa_net_timeout"],
      ["drivers/net/wan/farsync.c", "fst_tx_timeout"],
      ["drivers/net/wan/fsl_ucc_hdlc.c", "uhdlc_tx_timeout"],
      ["drivers/net/wan/lmc/lmc_main.c", "lmc_driver_timeout"],
      ["drivers/net/wan/x25_asy.c", "x25_asy_timeout"],
      ["drivers/net/wimax/i2400m/netdev.c", "i2400m_tx_timeout"],
      ["drivers/net/wireless/intel/ipw2x00/ipw2100.c", "ipw2100_tx_timeout"],
      ["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"],
      ["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"],
      ["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"],
      ["drivers/net/wireless/intersil/orinoco/main.c", "orinoco_tx_timeout"],
      ["drivers/net/wireless/intersil/orinoco/orinoco_usb.c", "orinoco_tx_timeout"],
      ["drivers/net/wireless/intersil/orinoco/orinoco.h", "orinoco_tx_timeout"],
      ["drivers/net/wireless/intersil/prism54/islpci_dev.c", "islpci_eth_tx_timeout"],
      ["drivers/net/wireless/intersil/prism54/islpci_eth.c", "islpci_eth_tx_timeout"],
      ["drivers/net/wireless/intersil/prism54/islpci_eth.h", "islpci_eth_tx_timeout"],
      ["drivers/net/wireless/marvell/mwifiex/main.c", "mwifiex_tx_timeout"],
      ["drivers/net/wireless/quantenna/qtnfmac/core.c", "qtnf_netdev_tx_timeout"],
      ["drivers/net/wireless/quantenna/qtnfmac/core.h", "qtnf_netdev_tx_timeout"],
      ["drivers/net/wireless/rndis_wlan.c", "usbnet_tx_timeout"],
      ["drivers/net/wireless/wl3501_cs.c", "wl3501_tx_timeout"],
      ["drivers/net/wireless/zydas/zd1201.c", "zd1201_tx_timeout"],
      ["drivers/s390/net/qeth_core.h", "qeth_tx_timeout"],
      ["drivers/s390/net/qeth_core_main.c", "qeth_tx_timeout"],
      ["drivers/s390/net/qeth_l2_main.c", "qeth_tx_timeout"],
      ["drivers/s390/net/qeth_l2_main.c", "qeth_tx_timeout"],
      ["drivers/s390/net/qeth_l3_main.c", "qeth_tx_timeout"],
      ["drivers/s390/net/qeth_l3_main.c", "qeth_tx_timeout"],
      ["drivers/staging/ks7010/ks_wlan_net.c", "ks_wlan_tx_timeout"],
      ["drivers/staging/qlge/qlge_main.c", "qlge_tx_timeout"],
      ["drivers/staging/rtl8192e/rtl8192e/rtl_core.c", "_rtl92e_tx_timeout"],
      ["drivers/staging/rtl8192u/r8192U_core.c", "tx_timeout"],
      ["drivers/staging/unisys/visornic/visornic_main.c", "visornic_xmit_timeout"],
      ["drivers/staging/wlan-ng/p80211netdev.c", "p80211knetdev_tx_timeout"],
      ["drivers/tty/n_gsm.c", "gsm_mux_net_tx_timeout"],
      ["drivers/tty/synclink.c", "hdlcdev_tx_timeout"],
      ["drivers/tty/synclink_gt.c", "hdlcdev_tx_timeout"],
      ["drivers/tty/synclinkmp.c", "hdlcdev_tx_timeout"],
      ["net/atm/lec.c", "lec_tx_timeout"],
      ["net/bluetooth/bnep/netdev.c", "bnep_net_timeout"]
      );
      
      for my $p (@work) {
      	my @pair = @$p;
      	my $file = $pair[0];
      	my $func = $pair[1];
      	print STDERR $file , ": ", $func,"\n";
      	our @ARGV = ($file);
      	while (<ARGV>) {
      		if (m/($func\s*\(struct\s+net_device\s+\*[A-Za-z_]?[A-Za-z-0-9_]*)(\))/) {
      			print STDERR "found $1+$2 in $file\n";
      		}
      		if (s/($func\s*\(struct\s+net_device\s+\*[A-Za-z_]?[A-Za-z-0-9_]*)(\))/$1, unsigned int txqueue$2/) {
      			print STDERR "$func found in $file\n";
      		}
      		print;
      	}
      }
      
      where the list of files and functions is simply from:
      
      git grep ndo_tx_timeout, with manual addition of headers
      in the rare cases where the function is from a header,
      then manually changing the few places which actually
      call ndo_tx_timeout.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Acked-by: NHeiner Kallweit <hkallweit1@gmail.com>
      Acked-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: NShannon Nelson <snelson@pensando.io>
      Reviewed-by: NMartin Habets <mhabets@solarflare.com>
      
      changes from v9:
      	fixup a forward declaration
      changes from v9:
      	more leftovers from v3 change
      changes from v8:
              fix up a missing direct call to timeout
              rebased on net-next
      changes from v7:
      	fixup leftovers from v3 change
      changes from v6:
      	fix typo in rtl driver
      changes from v5:
      	add missing files (allow any net device argument name)
      changes from v4:
      	add a missing driver header
      changes from v3:
              change queue # to unsigned
      Changes from v2:
              added headers
      Changes from v1:
              Fix errors found by kbuild:
              generalize the pattern a bit, to pick up
              a couple of instances missed by the previous
              version.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0290bd29
  6. 09 11月, 2019 1 次提交
    • E
      net/sched: annotate lockless accesses to qdisc->empty · 90b2be27
      Eric Dumazet 提交于
      KCSAN reported the following race [1]
      
      BUG: KCSAN: data-race in __dev_queue_xmit / net_tx_action
      
      read to 0xffff8880ba403508 of 1 bytes by task 21814 on cpu 1:
       __dev_xmit_skb net/core/dev.c:3389 [inline]
       __dev_queue_xmit+0x9db/0x1b40 net/core/dev.c:3761
       dev_queue_xmit+0x21/0x30 net/core/dev.c:3825
       neigh_hh_output include/net/neighbour.h:500 [inline]
       neigh_output include/net/neighbour.h:509 [inline]
       ip6_finish_output2+0x873/0xec0 net/ipv6/ip6_output.c:116
       __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
       __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
       ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
       dst_output include/net/dst.h:436 [inline]
       ip6_local_out+0x74/0x90 net/ipv6/output_core.c:179
       ip6_send_skb+0x53/0x110 net/ipv6/ip6_output.c:1795
       udp_v6_send_skb.isra.0+0x3ec/0xa70 net/ipv6/udp.c:1173
       udpv6_sendmsg+0x1906/0x1c20 net/ipv6/udp.c:1471
       inet6_sendmsg+0x6d/0x90 net/ipv6/af_inet6.c:576
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0x9f/0xc0 net/socket.c:657
       ___sys_sendmsg+0x2b7/0x5d0 net/socket.c:2311
       __sys_sendmmsg+0x123/0x350 net/socket.c:2413
       __do_sys_sendmmsg net/socket.c:2442 [inline]
       __se_sys_sendmmsg net/socket.c:2439 [inline]
       __x64_sys_sendmmsg+0x64/0x80 net/socket.c:2439
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      write to 0xffff8880ba403508 of 1 bytes by interrupt on cpu 0:
       qdisc_run_begin include/net/sch_generic.h:160 [inline]
       qdisc_run include/net/pkt_sched.h:120 [inline]
       net_tx_action+0x2b1/0x6c0 net/core/dev.c:4551
       __do_softirq+0x115/0x33f kernel/softirq.c:292
       do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
       do_softirq.part.0+0x6b/0x80 kernel/softirq.c:337
       do_softirq kernel/softirq.c:329 [inline]
       __local_bh_enable_ip+0x76/0x80 kernel/softirq.c:189
       local_bh_enable include/linux/bottom_half.h:32 [inline]
       rcu_read_unlock_bh include/linux/rcupdate.h:688 [inline]
       ip6_finish_output2+0x7bb/0xec0 net/ipv6/ip6_output.c:117
       __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
       __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
       ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
       dst_output include/net/dst.h:436 [inline]
       ip6_local_out+0x74/0x90 net/ipv6/output_core.c:179
       ip6_send_skb+0x53/0x110 net/ipv6/ip6_output.c:1795
       udp_v6_send_skb.isra.0+0x3ec/0xa70 net/ipv6/udp.c:1173
       udpv6_sendmsg+0x1906/0x1c20 net/ipv6/udp.c:1471
       inet6_sendmsg+0x6d/0x90 net/ipv6/af_inet6.c:576
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0x9f/0xc0 net/socket.c:657
       ___sys_sendmsg+0x2b7/0x5d0 net/socket.c:2311
       __sys_sendmmsg+0x123/0x350 net/socket.c:2413
       __do_sys_sendmmsg net/socket.c:2442 [inline]
       __se_sys_sendmmsg net/socket.c:2439 [inline]
       __x64_sys_sendmmsg+0x64/0x80 net/socket.c:2439
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 21817 Comm: syz-executor.2 Not tainted 5.4.0-rc6+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: d518d2ed ("net/sched: fix race between deactivation and dequeue for NOLOCK qdisc")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90b2be27
  7. 26 10月, 2019 2 次提交
    • V
      net: sch_generic: Use pfifo_fast as fallback scheduler for CAN hardware · 546b85bb
      Vincent Prince 提交于
      There is networking hardware that isn't based on Ethernet for layers 1 and 2.
      
      For example CAN.
      
      CAN is a multi-master serial bus standard for connecting Electronic Control
      Units [ECUs] also known as nodes. A frame on the CAN bus carries up to 8 bytes
      of payload. Frame corruption is detected by a CRC. However frame loss due to
      corruption is possible, but a quite unusual phenomenon.
      
      While fq_codel works great for TCP/IP, it doesn't for CAN. There are a lot of
      legacy protocols on top of CAN, which are not build with flow control or high
      CAN frame drop rates in mind.
      
      When using fq_codel, as soon as the queue reaches a certain delay based length,
      skbs from the head of the queue are silently dropped. Silently meaning that the
      user space using a send() or similar syscall doesn't get an error. However
      TCP's flow control algorithm will detect dropped packages and adjust the
      bandwidth accordingly.
      
      When using fq_codel and sending raw frames over CAN, which is the common use
      case, the user space thinks the package has been sent without problems, because
      send() returned without an error. pfifo_fast will drop skbs, if the queue
      length exceeds the maximum. But with this scheduler the skbs at the tail are
      dropped, an error (-ENOBUFS) is propagated to user space. So that the user
      space can slow down the package generation.
      
      On distributions, where fq_codel is made default via CONFIG_DEFAULT_NET_SCH
      during compile time, or set default during runtime with sysctl
      net.core.default_qdisc (see [1]), we get a bad user experience. In my test case
      with pfifo_fast, I can transfer thousands of million CAN frames without a frame
      drop. On the other hand with fq_codel there is more then one lost CAN frame per
      thousand frames.
      
      As pointed out fq_codel is not suited for CAN hardware, so this patch changes
      attach_one_default_qdisc() to use pfifo_fast for "ARPHRD_CAN" network devices.
      
      During transition of a netdev from down to up state the default queuing
      discipline is attached by attach_default_qdiscs() with the help of
      attach_one_default_qdisc(). This patch modifies attach_one_default_qdisc() to
      attach the pfifo_fast (pfifo_fast_ops) if the network device type is
      "ARPHRD_CAN".
      
      [1] https://github.com/systemd/systemd/issues/9194Suggested-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NVincent Prince <vincent.prince.fr@gmail.com>
      Acked-by: NDave Taht <dave.taht@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      546b85bb
    • V
      net: sch_generic: Use pfifo_fast as fallback scheduler for CAN hardware · fa784f2a
      Vincent Prince 提交于
      There is networking hardware that isn't based on Ethernet for layers 1 and 2.
      
      For example CAN.
      
      CAN is a multi-master serial bus standard for connecting Electronic Control
      Units [ECUs] also known as nodes. A frame on the CAN bus carries up to 8 bytes
      of payload. Frame corruption is detected by a CRC. However frame loss due to
      corruption is possible, but a quite unusual phenomenon.
      
      While fq_codel works great for TCP/IP, it doesn't for CAN. There are a lot of
      legacy protocols on top of CAN, which are not build with flow control or high
      CAN frame drop rates in mind.
      
      When using fq_codel, as soon as the queue reaches a certain delay based length,
      skbs from the head of the queue are silently dropped. Silently meaning that the
      user space using a send() or similar syscall doesn't get an error. However
      TCP's flow control algorithm will detect dropped packages and adjust the
      bandwidth accordingly.
      
      When using fq_codel and sending raw frames over CAN, which is the common use
      case, the user space thinks the package has been sent without problems, because
      send() returned without an error. pfifo_fast will drop skbs, if the queue
      length exceeds the maximum. But with this scheduler the skbs at the tail are
      dropped, an error (-ENOBUFS) is propagated to user space. So that the user
      space can slow down the package generation.
      
      On distributions, where fq_codel is made default via CONFIG_DEFAULT_NET_SCH
      during compile time, or set default during runtime with sysctl
      net.core.default_qdisc (see [1]), we get a bad user experience. In my test case
      with pfifo_fast, I can transfer thousands of million CAN frames without a frame
      drop. On the other hand with fq_codel there is more then one lost CAN frame per
      thousand frames.
      
      As pointed out fq_codel is not suited for CAN hardware, so this patch changes
      attach_one_default_qdisc() to use pfifo_fast for "ARPHRD_CAN" network devices.
      
      During transition of a netdev from down to up state the default queuing
      discipline is attached by attach_default_qdiscs() with the help of
      attach_one_default_qdisc(). This patch modifies attach_one_default_qdisc() to
      attach the pfifo_fast (pfifo_fast_ops) if the network device type is
      "ARPHRD_CAN".
      
      [1] https://github.com/systemd/systemd/issues/9194Signed-off-by: NVincent Prince <vincent.prince.fr@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fa784f2a
  8. 25 10月, 2019 1 次提交
    • T
      net: core: add generic lockdep keys · ab92d68f
      Taehee Yoo 提交于
      Some interface types could be nested.
      (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc..)
      These interface types should set lockdep class because, without lockdep
      class key, lockdep always warn about unexisting circular locking.
      
      In the current code, these interfaces have their own lockdep class keys and
      these manage itself. So that there are so many duplicate code around the
      /driver/net and /net/.
      This patch adds new generic lockdep keys and some helper functions for it.
      
      This patch does below changes.
      a) Add lockdep class keys in struct net_device
         - qdisc_running, xmit, addr_list, qdisc_busylock
         - these keys are used as dynamic lockdep key.
      b) When net_device is being allocated, lockdep keys are registered.
         - alloc_netdev_mqs()
      c) When net_device is being free'd llockdep keys are unregistered.
         - free_netdev()
      d) Add generic lockdep key helper function
         - netdev_register_lockdep_key()
         - netdev_unregister_lockdep_key()
         - netdev_update_lockdep_key()
      e) Remove unnecessary generic lockdep macro and functions
      f) Remove unnecessary lockdep code of each interfaces.
      
      After this patch, each interface modules don't need to maintain
      their lockdep keys.
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ab92d68f
  9. 18 10月, 2019 1 次提交
  10. 03 10月, 2019 1 次提交
    • E
      net_sched: remove need_resched() from qdisc_run() · b60fa1c5
      Eric Dumazet 提交于
      The introduction of this schedule point was done in commit
      2ba2506c ("[NET]: Add preemption point in qdisc_run")
      at a time the loop was not bounded.
      
      Then later in commit d5b8aa1d ("net_sched: fix dequeuer fairness")
      we added a limit on the number of packets.
      
      Now is the time to remove the schedule point, since the default
      limit of 64 packets matches the number of packets a typical NAPI
      poll can process in a row.
      
      This solves a latency problem for most TCP receivers under moderate load :
      
      1) host receives a packet.
         NET_RX_SOFTIRQ is raised by NIC hard IRQ handler
      
      2) __do_softirq() does its first loop, handling NET_RX_SOFTIRQ
         and calling the driver napi->loop() function
      
      3) TCP stores the skb in socket receive queue:
      
      4) TCP calls sk->sk_data_ready() and wakeups a user thread
         waiting for EPOLLIN (as a result, need_resched() might now be true)
      
      5) TCP cooks an ACK and sends it.
      
      6) qdisc_run() processes one packet from qdisc, and sees need_resched(),
         this raises NET_TX_SOFTIRQ (even if there are no more packets in
         the qdisc)
      
      Then we go back to the __do_softirq() in 2), and we see that new
      softirqs were raised. Since need_resched() is true, we end up waking
      ksoftirqd in this path :
      
          if (pending) {
                  if (time_before(jiffies, end) && !need_resched() &&
                      --max_restart)
                          goto restart;
      
                  wakeup_softirqd();
          }
      
      So we have many wakeups of ksoftirqd kernel threads,
      and more calls to qdisc_run() with associated lock overhead.
      
      Note that another way to solve the issue would be to change TCP
      to first send the ACK packet, then signal the EPOLLIN,
      but this changes P99 latencies, as sending the ACK packet
      can add a long delay.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b60fa1c5
  11. 16 9月, 2019 1 次提交
  12. 06 9月, 2019 1 次提交
    • E
      net: sched: fix reordering issues · b88dd52c
      Eric Dumazet 提交于
      Whenever MQ is not used on a multiqueue device, we experience
      serious reordering problems. Bisection found the cited
      commit.
      
      The issue can be described this way :
      
      - A single qdisc hierarchy is shared by all transmit queues.
        (eg : tc qdisc replace dev eth0 root fq_codel)
      
      - When/if try_bulk_dequeue_skb_slow() dequeues a packet targetting
        a different transmit queue than the one used to build a packet train,
        we stop building the current list and save the 'bad' skb (P1) in a
        special queue. (bad_txq)
      
      - When dequeue_skb() calls qdisc_dequeue_skb_bad_txq() and finds this
        skb (P1), it checks if the associated transmit queues is still in frozen
        state. If the queue is still blocked (by BQL or NIC tx ring full),
        we leave the skb in bad_txq and return NULL.
      
      - dequeue_skb() calls q->dequeue() to get another packet (P2)
      
        The other packet can target the problematic queue (that we found
        in frozen state for the bad_txq packet), but another cpu just ran
        TX completion and made room in the txq that is now ready to accept
        new packets.
      
      - Packet P2 is sent while P1 is still held in bad_txq, P1 might be sent
        at next round. In practice P2 is the lead of a big packet train
        (P2,P3,P4 ...) filling the BQL budget and delaying P1 by many packets :/
      
      To solve this problem, we have to block the dequeue process as long
      as the first packet in bad_txq can not be sent. Reordering issues
      disappear and no side effects have been seen.
      
      Fixes: a53851e2 ("net: sched: explicit locking in gso_cpu fallback")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b88dd52c
  13. 29 8月, 2019 2 次提交
    • D
      net/sched: pfifo_fast: fix wrong dereference in pfifo_fast_enqueue · 092e22e5
      Davide Caratti 提交于
      Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
      'TCQ_F_NOLOCK' bit in the parent qdisc, we can't assume anymore that
      per-cpu counters are there in the error path of skb_array_produce().
      Otherwise, the following splat can be seen:
      
       Unable to handle kernel paging request at virtual address 0000600dea430008
       Mem abort info:
         ESR = 0x96000005
         Exception class = DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
       Data abort info:
         ISV = 0, ISS = 0x00000005
         CM = 0, WnR = 0
       user pgtable: 64k pages, 48-bit VAs, pgdp = 000000007b97530e
       [0000600dea430008] pgd=0000000000000000, pud=0000000000000000
       Internal error: Oops: 96000005 [#1] SMP
      [...]
       pstate: 10000005 (nzcV daif -PAN -UAO)
       pc : pfifo_fast_enqueue+0x524/0x6e8
       lr : pfifo_fast_enqueue+0x46c/0x6e8
       sp : ffff800d39376fe0
       x29: ffff800d39376fe0 x28: 1ffff001a07d1e40
       x27: ffff800d03e8f188 x26: ffff800d03e8f200
       x25: 0000000000000062 x24: ffff800d393772f0
       x23: 0000000000000000 x22: 0000000000000403
       x21: ffff800cca569a00 x20: ffff800d03e8ee00
       x19: ffff800cca569a10 x18: 00000000000000bf
       x17: 0000000000000000 x16: 0000000000000000
       x15: 0000000000000000 x14: ffff1001a726edd0
       x13: 1fffe4000276a9a4 x12: 0000000000000000
       x11: dfff200000000000 x10: ffff800d03e8f1a0
       x9 : 0000000000000003 x8 : 0000000000000000
       x7 : 00000000f1f1f1f1 x6 : ffff1001a726edea
       x5 : ffff800cca56a53c x4 : 1ffff001bf9a8003
       x3 : 1ffff001bf9a8003 x2 : 1ffff001a07d1dcb
       x1 : 0000600dea430000 x0 : 0000600dea430008
       Process ping (pid: 6067, stack limit = 0x00000000dc0aa557)
       Call trace:
        pfifo_fast_enqueue+0x524/0x6e8
        htb_enqueue+0x660/0x10e0 [sch_htb]
        __dev_queue_xmit+0x123c/0x2de0
        dev_queue_xmit+0x24/0x30
        ip_finish_output2+0xc48/0x1720
        ip_finish_output+0x548/0x9d8
        ip_output+0x334/0x788
        ip_local_out+0x90/0x138
        ip_send_skb+0x44/0x1d0
        ip_push_pending_frames+0x5c/0x78
        raw_sendmsg+0xed8/0x28d0
        inet_sendmsg+0xc4/0x5c0
        sock_sendmsg+0xac/0x108
        __sys_sendto+0x1ac/0x2a0
        __arm64_sys_sendto+0xc4/0x138
        el0_svc_handler+0x13c/0x298
        el0_svc+0x8/0xc
       Code: f9402e80 d538d081 91002000 8b010000 (885f7c03)
      
      Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
      before dereferencing 'qdisc->cpu_qstats'.
      
      Fixes: 8a53e616 ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
      CC: Paolo Abeni <pabeni@redhat.com>
      CC: Stefano Brivio <sbrivio@redhat.com>
      Reported-by: NLi Shuang <shuali@redhat.com>
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Acked-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      092e22e5
    • D
      net/sched: pfifo_fast: fix wrong dereference when qdisc is reset · 04d37cf4
      Davide Caratti 提交于
      Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
      'TCQ_F_NOLOCK' bit in the parent qdisc, we need to be sure that per-cpu
      counters are present when 'reset()' is called for pfifo_fast qdiscs.
      Otherwise, the following script:
      
       # tc q a dev lo handle 1: root htb default 100
       # tc c a dev lo parent 1: classid 1:100 htb \
       > rate 95Mbit ceil 100Mbit burst 64k
       [...]
       # tc f a dev lo parent 1: protocol arp basic classid 1:100
       [...]
       # tc q a dev lo parent 1:100 handle 100: pfifo_fast
       [...]
       # tc q d dev lo root
      
      can generate the following splat:
      
       Unable to handle kernel paging request at virtual address dfff2c01bd148000
       Mem abort info:
         ESR = 0x96000004
         Exception class = DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
       Data abort info:
         ISV = 0, ISS = 0x00000004
         CM = 0, WnR = 0
       [dfff2c01bd148000] address between user and kernel address ranges
       Internal error: Oops: 96000004 [#1] SMP
       [...]
       pstate: 80000005 (Nzcv daif -PAN -UAO)
       pc : pfifo_fast_reset+0x280/0x4d8
       lr : pfifo_fast_reset+0x21c/0x4d8
       sp : ffff800d09676fa0
       x29: ffff800d09676fa0 x28: ffff200012ee22e4
       x27: dfff200000000000 x26: 0000000000000000
       x25: ffff800ca0799958 x24: ffff1001940f332b
       x23: 0000000000000007 x22: ffff200012ee1ab8
       x21: 0000600de8a40000 x20: 0000000000000000
       x19: ffff800ca0799900 x18: 0000000000000000
       x17: 0000000000000002 x16: 0000000000000000
       x15: 0000000000000000 x14: 0000000000000000
       x13: 0000000000000000 x12: ffff1001b922e6e2
       x11: 1ffff001b922e6e1 x10: 0000000000000000
       x9 : 1ffff001b922e6e1 x8 : dfff200000000000
       x7 : 0000000000000000 x6 : 0000000000000000
       x5 : 1fffe400025dc45c x4 : 1fffe400025dc357
       x3 : 00000c01bd148000 x2 : 0000600de8a40000
       x1 : 0000000000000007 x0 : 0000600de8a40004
       Call trace:
        pfifo_fast_reset+0x280/0x4d8
        qdisc_reset+0x6c/0x370
        htb_reset+0x150/0x3b8 [sch_htb]
        qdisc_reset+0x6c/0x370
        dev_deactivate_queue.constprop.5+0xe0/0x1a8
        dev_deactivate_many+0xd8/0x908
        dev_deactivate+0xe4/0x190
        qdisc_graft+0x88c/0xbd0
        tc_get_qdisc+0x418/0x8a8
        rtnetlink_rcv_msg+0x3a8/0xa78
        netlink_rcv_skb+0x18c/0x328
        rtnetlink_rcv+0x28/0x38
        netlink_unicast+0x3c4/0x538
        netlink_sendmsg+0x538/0x9a0
        sock_sendmsg+0xac/0xf8
        ___sys_sendmsg+0x53c/0x658
        __sys_sendmsg+0xc8/0x140
        __arm64_sys_sendmsg+0x74/0xa8
        el0_svc_handler+0x164/0x468
        el0_svc+0x10/0x14
       Code: 910012a0 92400801 d343fc03 11000c21 (38fb6863)
      
      Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
      before dereferencing 'qdisc->cpu_qstats'.
      
      Changes since v1:
       - coding style improvements, thanks to Stefano Brivio
      
      Fixes: 8a53e616 ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
      CC: Paolo Abeni <pabeni@redhat.com>
      Reported-by: NLi Shuang <shuali@redhat.com>
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Acked-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04d37cf4
  14. 31 5月, 2019 1 次提交
  15. 04 5月, 2019 1 次提交
  16. 11 4月, 2019 3 次提交
    • P
      Revert: "net: sched: put back q.qlen into a single location" · 73eb628d
      Paolo Abeni 提交于
      This revert commit 46b1c18f ("net: sched: put back q.qlen into
      a single location").
      After the previous patch, when a NOLOCK qdisc is enslaved to a
      locking qdisc it switches to global stats accounting. As a consequence,
      when a classful qdisc accesses directly a child qdisc's qlen, such
      qdisc is not doing per CPU accounting and qlen value is consistent.
      
      In the control path nobody uses directly qlen since commit
      e5f0e8f8 ("net: sched: introduce and use qdisc tree flush/purge
      helpers"), so we can remove the contented atomic ops from the
      datapath.
      
      v1 -> v2:
       - complete the qdisc_qstats_atomic_qlen_dec() ->
         qdisc_qstats_cpu_qlen_dec() replacement, fix build issue
       - more descriptive commit message
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      73eb628d
    • P
      net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too · 8a53e616
      Paolo Abeni 提交于
      Since stats updating is always consistent with TCQ_F_CPUSTATS flag,
      we can disable it at qdisc creation time flipping such bit.
      
      In my experiments, if the NOLOCK flag is cleared, per CPU stats
      accounting does not give any measurable performance gain, but it
      waste some memory.
      
      Let's clear TCQ_F_CPUSTATS together with NOLOCK, when enslaving
      a NOLOCK qdisc to 'lock' one.
      
      Use stats update helper inside pfifo_fast, to cope correctly with
      TCQ_F_CPUSTATS flag change.
      
      As a side effect, q.qlen value for any child qdiscs is always
      consistent for all lock classfull qdiscs.
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8a53e616
    • P
      net: sched: always do stats accounting according to TCQ_F_CPUSTATS · 9c01c9f1
      Paolo Abeni 提交于
      The core sched implementation checks independently for NOLOCK flag
      to acquire/release the root spin lock and for qdisc_is_percpu_stats()
      to account per CPU values in many places.
      
      This change update the last few places checking the TCQ_F_NOLOCK to
      do per CPU stats accounting according to qdisc_is_percpu_stats()
      value.
      
      The above allows to clean dev_requeue_skb() implementation a bit
      and makes stats update always consistent with a single flag.
      
      v1 -> v2:
       - do not move qdisc_is_empty definition, fix build issue
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9c01c9f1
  17. 24 3月, 2019 1 次提交
  18. 03 3月, 2019 1 次提交
    • E
      net: sched: put back q.qlen into a single location · 46b1c18f
      Eric Dumazet 提交于
      In the series fc8b81a5 ("Merge branch 'lockless-qdisc-series'")
      John made the assumption that the data path had no need to read
      the qdisc qlen (number of packets in the qdisc).
      
      It is true when pfifo_fast is used as the root qdisc, or as direct MQ/MQPRIO
      children.
      
      But pfifo_fast can be used as leaf in class full qdiscs, and existing
      logic needs to access the child qlen in an efficient way.
      
      HTB breaks badly, since it uses cl->leaf.q->q.qlen in :
        htb_activate() -> WARN_ON()
        htb_dequeue_tree() to decide if a class can be htb_deactivated
        when it has no more packets.
      
      HFSC, DRR, CBQ, QFQ have similar issues, and some calls to
      qdisc_tree_reduce_backlog() also read q.qlen directly.
      
      Using qdisc_qlen_sum() (which iterates over all possible cpus)
      in the data path is a non starter.
      
      It seems we have to put back qlen in a central location,
      at least for stable kernels.
      
      For all qdisc but pfifo_fast, qlen is guarded by the qdisc lock,
      so the existing q.qlen{++|--} are correct.
      
      For 'lockless' qdisc (pfifo_fast so far), we need to use atomic_{inc|dec}()
      because the spinlock might be not held (for example from
      pfifo_fast_enqueue() and pfifo_fast_dequeue())
      
      This patch adds atomic_qlen (in the same location than qlen)
      and renames the following helpers, since we want to express
      they can be used without qdisc lock, and that qlen is no longer percpu.
      
      - qdisc_qstats_cpu_qlen_dec -> qdisc_qstats_atomic_qlen_dec()
      - qdisc_qstats_cpu_qlen_inc -> qdisc_qstats_atomic_qlen_inc()
      
      Later (net-next) we might revert this patch by tracking all these
      qlen uses and replace them by a more efficient method (not having
      to access a precise qlen, but an empty/non_empty status that might
      be less expensive to maintain/track).
      
      Another possibility is to have a legacy pfifo_fast version that would
      be used when used a a child qdisc, since the parent qdisc needs
      a spinlock anyway. But then, future lockless qdiscs would also
      have the same problem.
      
      Fixes: 7e66016f ("net: sched: helpers to sum qlen and qlen for per cpu logic")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46b1c18f
  19. 27 2月, 2019 1 次提交
  20. 13 2月, 2019 1 次提交
    • V
      net: sched: protect filter_chain list with filter_chain_lock mutex · ed76f5ed
      Vlad Buslov 提交于
      Extend tcf_chain with new filter_chain_lock mutex. Always lock the chain
      when accessing filter_chain list, instead of relying on rtnl lock.
      Dereference filter_chain with tcf_chain_dereference() lockdep macro to
      verify that all users of chain_list have the lock taken.
      
      Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to execute
      all necessary code while holding chain lock in order to prevent
      invalidation of chain_info structure by potential concurrent change. This
      also serializes calls to tcf_chain0_head_change(), which allows head change
      callbacks to rely on filter_chain_lock for synchronization instead of rtnl
      mutex.
      Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ed76f5ed
  21. 12 2月, 2019 1 次提交
  22. 02 12月, 2018 1 次提交
    • P
      net/sched: Replace call_rcu_bh() and rcu_barrier_bh() · ae0e3349
      Paul E. McKenney 提交于
      Now that call_rcu()'s callback is not invoked until after bh-disable
      regions of code have completed (in addition to explicitly marked
      RCU read-side critical sections), call_rcu() can be used in place
      of call_rcu_bh().  Similarly, rcu_barrier() can be used in place o
      frcu_barrier_bh().  This commit therefore makes these changes.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: <netdev@vger.kernel.org>
      ae0e3349
  23. 11 10月, 2018 1 次提交
  24. 29 9月, 2018 1 次提交
  25. 26 9月, 2018 2 次提交
  26. 11 9月, 2018 1 次提交
  27. 01 6月, 2018 1 次提交
    • S
      net: remove bypassed check in sch_direct_xmit() · 4341f830
      Song Liu 提交于
      Checking netif_xmit_frozen_or_stopped() at the end of sch_direct_xmit()
      is being bypassed. This is because "ret" from sch_direct_xmit() will be
      either NETDEV_TX_OK or NETDEV_TX_BUSY, and only ret == NETDEV_TX_OK == 0
      will reach the condition:
      
          if (ret && netif_xmit_frozen_or_stopped(txq))
              return false;
      
      This patch cleans up the code by removing the whole condition.
      
      For more discussion about this, please refer to
         https://marc.info/?t=152727195700008Signed-off-by: NSong Liu <songliubraving@fb.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4341f830
  28. 18 5月, 2018 2 次提交
  29. 17 5月, 2018 1 次提交
    • P
      sched: manipulate __QDISC_STATE_RUNNING in qdisc_run_* helpers · 32f7b44d
      Paolo Abeni 提交于
      Currently NOLOCK qdiscs pay a measurable overhead to atomically
      manipulate the __QDISC_STATE_RUNNING. Such bit is flipped twice per
      packet in the uncontended scenario with packet rate below the
      line rate: on packed dequeue and on the next, failing dequeue attempt.
      
      This changeset moves the bit manipulation into the qdisc_run_{begin,end}
      helpers, so that the bit is now flipped only once per packet, with
      measurable performance improvement in the uncontended scenario.
      
      This also allows simplifying the qdisc teardown code path - since
      qdisc_is_running() is now effective for each qdisc type - and avoid a
      possible race between qdisc_run() and dev_deactivate_many(), as now
      the some_qdisc_is_busy() can properly detect NOLOCK qdiscs being busy
      dequeuing packets.
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32f7b44d
  30. 27 3月, 2018 1 次提交
    • J
      net: sched, fix OOO packets with pfifo_fast · eb82a994
      John Fastabend 提交于
      After the qdisc lock was dropped in pfifo_fast we allow multiple
      enqueue threads and dequeue threads to run in parallel. On the
      enqueue side the skb bit ooo_okay is used to ensure all related
      skbs are enqueued in-order. On the dequeue side though there is
      no similar logic. What we observe is with fewer queues than CPUs
      it is possible to re-order packets when two instances of
      __qdisc_run() are running in parallel. Each thread will dequeue
      a skb and then whichever thread calls the ndo op first will
      be sent on the wire. This doesn't typically happen because
      qdisc_run() is usually triggered by the same core that did the
      enqueue. However, drivers will trigger __netif_schedule()
      when queues are transitioning from stopped to awake using the
      netif_tx_wake_* APIs. When this happens netif_schedule() calls
      qdisc_run() on the same CPU that did the netif_tx_wake_* which
      is usually done in the interrupt completion context. This CPU
      is selected with the irq affinity which is unrelated to the
      enqueue operations.
      
      To resolve this we add a RUNNING bit to the qdisc to ensure
      only a single dequeue per qdisc is running. Enqueue and dequeue
      operations can still run in parallel and also on multi queue
      NICs we can still have a dequeue in-flight per qdisc, which
      is typically per CPU.
      
      Fixes: c5ad119f ("net: sched: pfifo_fast use skb_array")
      Reported-by: NJakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eb82a994
  31. 18 3月, 2018 1 次提交
    • E
      net: sched: fix uses after free · cce6294c
      Eric Dumazet 提交于
      syzbot reported one use-after-free in pfifo_fast_enqueue() [1]
      
      Issue here is that we can not reuse skb after a successful skb_array_produce()
      since another cpu might have consumed it already.
      
      I believe a similar problem exists in try_bulk_dequeue_skb_slow()
      in case we put an skb into qdisc_enqueue_skb_bad_txq() for lockless qdisc.
      
      [1]
      BUG: KASAN: use-after-free in qdisc_pkt_len include/net/sch_generic.h:610 [inline]
      BUG: KASAN: use-after-free in qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline]
      BUG: KASAN: use-after-free in pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639
      Read of size 4 at addr ffff8801cede37e8 by task syzkaller717588/5543
      
      CPU: 1 PID: 5543 Comm: syzkaller717588 Not tainted 4.16.0-rc4+ #265
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x24d lib/dump_stack.c:53
       print_address_description+0x73/0x250 mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report+0x23c/0x360 mm/kasan/report.c:412
       __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
       qdisc_pkt_len include/net/sch_generic.h:610 [inline]
       qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline]
       pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639
       __dev_xmit_skb net/core/dev.c:3216 [inline]
      
      Fixes: c5ad119f ("net: sched: pfifo_fast use skb_array")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: syzbot+ed43b6903ab968b16f54@syzkaller.appspotmail.com
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc:	Cong Wang <xiyou.wangcong@gmail.com>
      Cc:	Jiri Pirko <jiri@resnulli.us>
      Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cce6294c
  32. 30 1月, 2018 2 次提交
    • C
      net_sched: implement ->change_tx_queue_len() for pfifo_fast · 7007ba63
      Cong Wang 提交于
      pfifo_fast used to drop based on qdisc_dev(qdisc)->tx_queue_len,
      so we have to resize skb array when we change tx_queue_len.
      
      Other qdiscs which read tx_queue_len are fine because they
      all save it to sch->limit or somewhere else in qdisc during init.
      They don't have to implement this, it is nicer if they do so
      that users don't have to re-configure qdisc after changing
      tx_queue_len.
      
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7007ba63
    • C
      net_sched: plug in qdisc ops change_tx_queue_len · 48bfd55e
      Cong Wang 提交于
      Introduce a new qdisc ops ->change_tx_queue_len() so that
      each qdisc could decide how to implement this if it wants.
      Previously we simply read dev->tx_queue_len, after pfifo_fast
      switches to skb array, we need this API to resize the skb array
      when we change dev->tx_queue_len.
      
      To avoid handling race conditions with TX BH, we need to
      deactivate all TX queues before change the value and bring them
      back after we are done, this also makes implementation easier.
      
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      48bfd55e