1. 07 8月, 2020 2 次提交
    • X
      drivers/net/wan/lapbether: Added needed_headroom and a skb->len check · c7ca03c2
      Xie He 提交于
      1. Added a skb->len check
      
      This driver expects upper layers to include a pseudo header of 1 byte
      when passing down a skb for transmission. This driver will read this
      1-byte header. This patch added a skb->len check before reading the
      header to make sure the header exists.
      
      2. Changed to use needed_headroom instead of hard_header_len to request
      necessary headroom to be allocated
      
      In net/packet/af_packet.c, the function packet_snd first reserves a
      headroom of length (dev->hard_header_len + dev->needed_headroom).
      Then if the socket is a SOCK_DGRAM socket, it calls dev_hard_header,
      which calls dev->header_ops->create, to create the link layer header.
      If the socket is a SOCK_RAW socket, it "un-reserves" a headroom of
      length (dev->hard_header_len), and assumes the user to provide the
      appropriate link layer header.
      
      So according to the logic of af_packet.c, dev->hard_header_len should
      be the length of the header that would be created by
      dev->header_ops->create.
      
      However, this driver doesn't provide dev->header_ops, so logically
      dev->hard_header_len should be 0.
      
      So we should use dev->needed_headroom instead of dev->hard_header_len
      to request necessary headroom to be allocated.
      
      This change fixes kernel panic when this driver is used with AF_PACKET
      SOCK_RAW sockets.
      
      Call stack when panic:
      
      [  168.399197] skbuff: skb_under_panic: text:ffffffff819d95fb len:20
      put:14 head:ffff8882704c0a00 data:ffff8882704c09fd tail:0x11 end:0xc0
      dev:veth0
      ...
      [  168.399255] Call Trace:
      [  168.399259]  skb_push.cold+0x14/0x24
      [  168.399262]  eth_header+0x2b/0xc0
      [  168.399267]  lapbeth_data_transmit+0x9a/0xb0 [lapbether]
      [  168.399275]  lapb_data_transmit+0x22/0x2c [lapb]
      [  168.399277]  lapb_transmit_buffer+0x71/0xb0 [lapb]
      [  168.399279]  lapb_kick+0xe3/0x1c0 [lapb]
      [  168.399281]  lapb_data_request+0x76/0xc0 [lapb]
      [  168.399283]  lapbeth_xmit+0x56/0x90 [lapbether]
      [  168.399286]  dev_hard_start_xmit+0x91/0x1f0
      [  168.399289]  ? irq_init_percpu_irqstack+0xc0/0x100
      [  168.399291]  __dev_queue_xmit+0x721/0x8e0
      [  168.399295]  ? packet_parse_headers.isra.0+0xd2/0x110
      [  168.399297]  dev_queue_xmit+0x10/0x20
      [  168.399298]  packet_sendmsg+0xbf0/0x19b0
      ......
      
      Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Cc: Martin Schiller <ms@dev.tdt.de>
      Cc: Brian Norris <briannorris@chromium.org>
      Signed-off-by: NXie He <xie.he.0141@gmail.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7ca03c2
    • C
      net: hns3: fix spelling mistake "could'nt" -> "couldn't" · 8912fd6a
      Colin Ian King 提交于
      There is a spelling mistake in a dev_err message. Fix it.
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8912fd6a
  2. 06 8月, 2020 6 次提交
    • D
      net: thunderx: initialize VF's mailbox mutex before first usage · c1055b76
      Dean Nelson 提交于
      A VF's mailbox mutex is not getting initialized by nicvf_probe() until after
      it is first used. And such usage is resulting in...
      
      [   28.270927] ------------[ cut here ]------------
      [   28.270934] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
      [   28.270980] WARNING: CPU: 9 PID: 675 at kernel/locking/mutex.c:938 __mutex_lock+0xdac/0x12f0
      [   28.270985] Modules linked in: ast(+) nicvf(+) i2c_algo_bit drm_vram_helper drm_ttm_helper ttm nicpf(+) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ixgbe(+) sg thunder_bgx mdio i2c_thunderx mdio_thunder thunder_xcv mdio_cavium dm_mirror dm_region_hash dm_log dm_mod
      [   28.271064] CPU: 9 PID: 675 Comm: systemd-udevd Not tainted 4.18.0+ #1
      [   28.271070] Hardware name: GIGABYTE R120-T34-00/MT30-GS2-00, BIOS F02 08/06/2019
      [   28.271078] pstate: 60000005 (nZCv daif -PAN -UAO)
      [   28.271086] pc : __mutex_lock+0xdac/0x12f0
      [   28.271092] lr : __mutex_lock+0xdac/0x12f0
      [   28.271097] sp : ffff800d42146fb0
      [   28.271103] x29: ffff800d42146fb0 x28: 0000000000000000
      [   28.271113] x27: ffff800d24361180 x26: dfff200000000000
      [   28.271122] x25: 0000000000000000 x24: 0000000000000002
      [   28.271132] x23: ffff20001597cc80 x22: ffff2000139e9848
      [   28.271141] x21: 0000000000000000 x20: 1ffff001a8428e0c
      [   28.271151] x19: ffff200015d5d000 x18: 1ffff001ae0f2184
      [   28.271160] x17: 0000000000000000 x16: 0000000000000000
      [   28.271170] x15: ffff800d70790c38 x14: ffff20001597c000
      [   28.271179] x13: ffff20001597cc80 x12: ffff040002b2f779
      [   28.271189] x11: 1fffe40002b2f778 x10: ffff040002b2f778
      [   28.271199] x9 : 0000000000000000 x8 : 00000000f1f1f1f1
      [   28.271208] x7 : 00000000f2f2f2f2 x6 : 0000000000000000
      [   28.271217] x5 : 1ffff001ae0f2186 x4 : 1fffe400027eb03c
      [   28.271227] x3 : dfff200000000000 x2 : ffff1001a8428dbe
      [   28.271237] x1 : c87fdfac7ea11d00 x0 : 0000000000000000
      [   28.271246] Call trace:
      [   28.271254]  __mutex_lock+0xdac/0x12f0
      [   28.271261]  mutex_lock_nested+0x3c/0x50
      [   28.271297]  nicvf_send_msg_to_pf+0x40/0x3a0 [nicvf]
      [   28.271316]  nicvf_register_misc_interrupt+0x20c/0x328 [nicvf]
      [   28.271334]  nicvf_probe+0x508/0xda0 [nicvf]
      [   28.271344]  local_pci_probe+0xc4/0x180
      [   28.271352]  pci_device_probe+0x3ec/0x528
      [   28.271363]  driver_probe_device+0x21c/0xb98
      [   28.271371]  device_driver_attach+0xe8/0x120
      [   28.271379]  __driver_attach+0xe0/0x2a0
      [   28.271386]  bus_for_each_dev+0x118/0x190
      [   28.271394]  driver_attach+0x48/0x60
      [   28.271401]  bus_add_driver+0x328/0x558
      [   28.271409]  driver_register+0x148/0x398
      [   28.271416]  __pci_register_driver+0x14c/0x1b0
      [   28.271437]  nicvf_init_module+0x54/0x10000 [nicvf]
      [   28.271447]  do_one_initcall+0x18c/0xc18
      [   28.271457]  do_init_module+0x18c/0x618
      [   28.271464]  load_module+0x2bc0/0x4088
      [   28.271472]  __se_sys_finit_module+0x110/0x188
      [   28.271479]  __arm64_sys_finit_module+0x70/0xa0
      [   28.271490]  el0_svc_handler+0x15c/0x380
      [   28.271496]  el0_svc+0x8/0xc
      [   28.271502] irq event stamp: 52649
      [   28.271513] hardirqs last  enabled at (52649): [<ffff200011b4d790>] _raw_spin_unlock_irqrestore+0xc0/0xd8
      [   28.271522] hardirqs last disabled at (52648): [<ffff200011b4d3c4>] _raw_spin_lock_irqsave+0x3c/0xf0
      [   28.271530] softirqs last  enabled at (52330): [<ffff200010082af4>] __do_softirq+0xacc/0x117c
      [   28.271540] softirqs last disabled at (52313): [<ffff20001019b354>] irq_exit+0x3cc/0x500
      [   28.271545] ---[ end trace a9b90324c8a0d4ee ]---
      
      This problem is resolved by moving the call to mutex_init() up earlier
      in nicvf_probe().
      
      Fixes: 609ea65c ("net: thunderx: add mutex to protect mailbox from concurrent calls for same VF")
      Signed-off-by: NDean Nelson <dnelson@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c1055b76
    • O
      usb: hso: remove bogus check for EINPROGRESS · abaf00ff
      Oliver Neukum 提交于
      This check an inherent race. It opens a race where
      an error code has already been set or cleared yet
      the URB has not been given back. We cannot do
      such an optimization and must unlink unconditionally.
      Signed-off-by: NOliver Neukum <oneukum@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      abaf00ff
    • O
      usb: hso: no complaint about kmalloc failure · 11c5f6d2
      Oliver Neukum 提交于
      If this fails, kmalloc() will print a report including
      a stack trace. There is no need for a separate complaint.
      Signed-off-by: NOliver Neukum <oneukum@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      11c5f6d2
    • O
      hso: fix bailout in error case of probe · 5fcfb6d0
      Oliver Neukum 提交于
      The driver tries to reuse code for disconnect in case
      of a failed probe.
      If resources need to be freed after an error in probe, the
      netdev must not be freed because it has never been registered.
      Fix it by telling the helper which path we are in.
      Signed-off-by: NOliver Neukum <oneukum@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5fcfb6d0
    • V
      net: dsa: sja1105: use detected device id instead of DT one on mismatch · 0b0e2997
      Vladimir Oltean 提交于
      Although we can detect the chip revision 100% at runtime, it is useful
      to specify it in the device tree compatible string too, because
      otherwise there would be no way to assess the correctness of device tree
      bindings statically, without booting a board (only some switch versions
      have internal RGMII delays and/or an SGMII port).
      
      But for testing the P/Q/R/S support, what I have is a reworked board
      with the SJA1105T replaced by a pin-compatible SJA1105Q, and I don't
      want to keep a separate device tree blob just for this one-off board.
      Since just the chip has been replaced, its RGMII delay setup is
      inherently the same (meaning: delays added by the PHY on the slave
      ports, and by PCB traces on the fixed-link CPU port).
      
      For this board, I'd rather have the driver shout at me, but go ahead and
      use what it found even if it doesn't match what it's been told is there.
      
      [    2.970826] sja1105 spi0.1: Device tree specifies chip SJA1105T but found SJA1105Q, please fix it!
      [    2.980010] sja1105 spi0.1: Probed switch chip: SJA1105Q
      [    3.005082] sja1105 spi0.1: Enabled switch tagging
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b0e2997
    • H
      Revert "vxlan: fix tos value before xmit" · a0dced17
      Hangbin Liu 提交于
      This reverts commit 71130f29.
      
      In commit 71130f29 ("vxlan: fix tos value before xmit") we want to
      make sure the tos value are filtered by RT_TOS() based on RFC1349.
      
             0     1     2     3     4     5     6     7
          +-----+-----+-----+-----+-----+-----+-----+-----+
          |   PRECEDENCE    |          TOS          | MBZ |
          +-----+-----+-----+-----+-----+-----+-----+-----+
      
      But RFC1349 has been obsoleted by RFC2474. The new DSCP field defined like
      
             0     1     2     3     4     5     6     7
          +-----+-----+-----+-----+-----+-----+-----+-----+
          |          DS FIELD, DSCP           | ECN FIELD |
          +-----+-----+-----+-----+-----+-----+-----+-----+
      
      So with
      
      IPTOS_TOS_MASK          0x1E
      RT_TOS(tos)		((tos)&IPTOS_TOS_MASK)
      
      the first 3 bits DSCP info will get lost.
      
      To take all the DSCP info in xmit, we should revert the patch and just push
      all tos bits to ip_tunnel_ecn_encap(), which will handling ECN field later.
      
      Fixes: 71130f29 ("vxlan: fix tos value before xmit")
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Acked-by: NGuillaume Nault <gnault@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a0dced17
  3. 05 8月, 2020 12 次提交
    • C
      farsync: switch from 'pci_' to 'dma_' API · 4c900a6b
      Christophe JAILLET 提交于
      The wrappers in include/linux/pci-dma-compat.h should go away.
      
      The patch has been generated with the coccinelle script below and has been
      hand modified to replace GFP_ with a correct flag.
      It has been compile tested.
      
      When memory is allocated in 'fst_add_one()', GFP_KERNEL can be used
      because it is a probe function and no lock is acquired.
      
      @@
      @@
      -    PCI_DMA_BIDIRECTIONAL
      +    DMA_BIDIRECTIONAL
      
      @@
      @@
      -    PCI_DMA_TODEVICE
      +    DMA_TO_DEVICE
      
      @@
      @@
      -    PCI_DMA_FROMDEVICE
      +    DMA_FROM_DEVICE
      
      @@
      @@
      -    PCI_DMA_NONE
      +    DMA_NONE
      
      @@
      expression e1, e2, e3;
      @@
      -    pci_alloc_consistent(e1, e2, e3)
      +    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)
      
      @@
      expression e1, e2, e3;
      @@
      -    pci_zalloc_consistent(e1, e2, e3)
      +    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_free_consistent(e1, e2, e3, e4)
      +    dma_free_coherent(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_map_single(e1, e2, e3, e4)
      +    dma_map_single(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_unmap_single(e1, e2, e3, e4)
      +    dma_unmap_single(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4, e5;
      @@
      -    pci_map_page(e1, e2, e3, e4, e5)
      +    dma_map_page(&e1->dev, e2, e3, e4, e5)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_unmap_page(e1, e2, e3, e4)
      +    dma_unmap_page(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_map_sg(e1, e2, e3, e4)
      +    dma_map_sg(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_unmap_sg(e1, e2, e3, e4)
      +    dma_unmap_sg(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
      +    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_dma_sync_single_for_device(e1, e2, e3, e4)
      +    dma_sync_single_for_device(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
      +    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
      +    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2;
      @@
      -    pci_dma_mapping_error(e1, e2)
      +    dma_mapping_error(&e1->dev, e2)
      
      @@
      expression e1, e2;
      @@
      -    pci_set_dma_mask(e1, e2)
      +    dma_set_mask(&e1->dev, e2)
      
      @@
      expression e1, e2;
      @@
      -    pci_set_consistent_dma_mask(e1, e2)
      +    dma_set_coherent_mask(&e1->dev, e2)
      Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c900a6b
    • C
      wan: wanxl: switch from 'pci_' to 'dma_' API · 24dd377a
      Christophe JAILLET 提交于
      The wrappers in include/linux/pci-dma-compat.h should go away.
      
      The patch has been generated with the coccinelle script below and has been
      hand modified to replace GFP_ with a correct flag.
      It has been compile tested.
      
      When memory is allocated in 'wanxl_pci_init_one()', GFP_KERNEL can be used
      because it is a probe function and no lock is acquired.
      Moreover, just a few lines above, GFP_KERNEL is already used.
      
      @@
      @@
      -    PCI_DMA_BIDIRECTIONAL
      +    DMA_BIDIRECTIONAL
      
      @@
      @@
      -    PCI_DMA_TODEVICE
      +    DMA_TO_DEVICE
      
      @@
      @@
      -    PCI_DMA_FROMDEVICE
      +    DMA_FROM_DEVICE
      
      @@
      @@
      -    PCI_DMA_NONE
      +    DMA_NONE
      
      @@
      expression e1, e2, e3;
      @@
      -    pci_alloc_consistent(e1, e2, e3)
      +    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)
      
      @@
      expression e1, e2, e3;
      @@
      -    pci_zalloc_consistent(e1, e2, e3)
      +    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_free_consistent(e1, e2, e3, e4)
      +    dma_free_coherent(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_map_single(e1, e2, e3, e4)
      +    dma_map_single(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_unmap_single(e1, e2, e3, e4)
      +    dma_unmap_single(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4, e5;
      @@
      -    pci_map_page(e1, e2, e3, e4, e5)
      +    dma_map_page(&e1->dev, e2, e3, e4, e5)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_unmap_page(e1, e2, e3, e4)
      +    dma_unmap_page(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_map_sg(e1, e2, e3, e4)
      +    dma_map_sg(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_unmap_sg(e1, e2, e3, e4)
      +    dma_unmap_sg(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
      +    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_dma_sync_single_for_device(e1, e2, e3, e4)
      +    dma_sync_single_for_device(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
      +    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2, e3, e4;
      @@
      -    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
      +    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)
      
      @@
      expression e1, e2;
      @@
      -    pci_dma_mapping_error(e1, e2)
      +    dma_mapping_error(&e1->dev, e2)
      
      @@
      expression e1, e2;
      @@
      -    pci_set_dma_mask(e1, e2)
      +    dma_set_mask(&e1->dev, e2)
      
      @@
      expression e1, e2;
      @@
      -    pci_set_consistent_dma_mask(e1, e2)
      +    dma_set_coherent_mask(&e1->dev, e2)
      Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      24dd377a
    • S
      hv_netvsc: do not use VF device if link is down · 7c9864bb
      Stephen Hemminger 提交于
      If the accelerated networking SRIOV VF device has lost carrier
      use the synthetic network device which is available as backup
      path. This is a rare case since if VF link goes down, normally
      the VMBus device will also loose external connectivity as well.
      But if the communication is between two VM's on the same host
      the VMBus device will still work.
      Reported-by: N"Shah, Ashish N" <ashish.n.shah@intel.com>
      Fixes: 0c195567 ("netvsc: transparent VF management")
      Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
      Reviewed-by: NHaiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c9864bb
    • Y
      dpaa2-eth: Fix passing zero to 'PTR_ERR' warning · 02afa9c6
      YueHaibing 提交于
      Fix smatch warning:
      
      drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2419
       alloc_channel() warn: passing zero to 'ERR_PTR'
      
      setup_dpcon() should return ERR_PTR(err) instead of zero in error
      handling case.
      
      Fixes: d7f5a9d8 ("dpaa2-eth: defer probe on object allocate")
      Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02afa9c6
    • S
      net: macb: Properly handle phylink on at91sam9x · f7ba7dbf
      Stefan Roese 提交于
      I just recently noticed that ethernet does not work anymore since v5.5
      on the GARDENA smart Gateway, which is based on the AT91SAM9G25.
      Debugging showed that the "GEM bits" in the NCFGR register are now
      unconditionally accessed, which is incorrect for the !macb_is_gem()
      case.
      
      This patch adds the macb_is_gem() checks back to the code
      (in macb_mac_config() & macb_mac_link_up()), so that the GEM register
      bits are not accessed in this case any more.
      
      Fixes: 7897b071 ("net: macb: convert to phylink")
      Signed-off-by: NStefan Roese <sr@denx.de>
      Cc: Reto Schneider <reto.schneider@husqvarnagroup.com>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f7ba7dbf
    • X
      net: thunderx: use spin_lock_bh in nicvf_set_rx_mode_task() · bab9693a
      Xin Long 提交于
      A dead lock was triggered on thunderx driver:
      
              CPU0                    CPU1
              ----                    ----
         [01] lock(&(&nic->rx_mode_wq_lock)->rlock);
                                 [11] lock(&(&mc->mca_lock)->rlock);
                                 [12] lock(&(&nic->rx_mode_wq_lock)->rlock);
         [02] <Interrupt> lock(&(&mc->mca_lock)->rlock);
      
      The path for each is:
      
        [01] worker_thread() -> process_one_work() -> nicvf_set_rx_mode_task()
        [02] mld_ifc_timer_expire()
        [11] ipv6_add_dev() -> ipv6_dev_mc_inc() -> igmp6_group_added() ->
        [12] dev_mc_add() -> __dev_set_rx_mode() -> nicvf_set_rx_mode()
      
      To fix it, it needs to disable bh on [1], so that the timer on [2]
      wouldn't be triggered until rx_mode_wq_lock is released. So change
      to use spin_lock_bh() instead of spin_lock().
      
      Thanks to Paolo for helping with this.
      
      v1->v2:
        - post to netdev.
      Reported-by: NRafael P. <rparrazo@redhat.com>
      Tested-by: NDean Nelson <dnelson@redhat.com>
      Fixes: 469998c8 ("net: thunderx: prevent concurrent data re-writing by nicvf_set_rx_mode")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bab9693a
    • S
      geneve: Support for PMTU discovery on directly bridged links · c1a800e8
      Stefano Brivio 提交于
      If the interface is a bridge or Open vSwitch port, and we can't
      forward a packet because it exceeds the local PMTU estimate,
      trigger an ICMP or ICMPv6 reply to the sender, using the same
      interface to forward it back.
      
      If metadata collection is enabled, set destination and source
      addresses for the flow as if we were receiving the packet, so that
      Open vSwitch can match the ICMP error against the existing
      association.
      
      v2: Use netif_is_any_bridge_port() (David Ahern)
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c1a800e8
    • S
      vxlan: Support for PMTU discovery on directly bridged links · fc68c995
      Stefano Brivio 提交于
      If the interface is a bridge or Open vSwitch port, and we can't
      forward a packet because it exceeds the local PMTU estimate,
      trigger an ICMP or ICMPv6 reply to the sender, using the same
      interface to forward it back.
      
      If metadata collection is enabled, reverse destination and source
      addresses, so that Open vSwitch is able to match this packet against
      the existing, reverse flow.
      
      v2: Use netif_is_any_bridge_port() (David Ahern)
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc68c995
    • S
      tunnels: PMTU discovery support for directly bridged IP packets · 4cb47a86
      Stefano Brivio 提交于
      It's currently possible to bridge Ethernet tunnels carrying IP
      packets directly to external interfaces without assigning them
      addresses and routes on the bridged network itself: this is the case
      for UDP tunnels bridged with a standard bridge or by Open vSwitch.
      
      PMTU discovery is currently broken with those configurations, because
      the encapsulation effectively decreases the MTU of the link, and
      while we are able to account for this using PMTU discovery on the
      lower layer, we don't have a way to relay ICMP or ICMPv6 messages
      needed by the sender, because we don't have valid routes to it.
      
      On the other hand, as a tunnel endpoint, we can't fragment packets
      as a general approach: this is for instance clearly forbidden for
      VXLAN by RFC 7348, section 4.3:
      
         VTEPs MUST NOT fragment VXLAN packets.  Intermediate routers may
         fragment encapsulated VXLAN packets due to the larger frame size.
         The destination VTEP MAY silently discard such VXLAN fragments.
      
      The same paragraph recommends that the MTU over the physical network
      accomodates for encapsulations, but this isn't a practical option for
      complex topologies, especially for typical Open vSwitch use cases.
      
      Further, it states that:
      
         Other techniques like Path MTU discovery (see [RFC1191] and
         [RFC1981]) MAY be used to address this requirement as well.
      
      Now, PMTU discovery already works for routed interfaces, we get
      route exceptions created by the encapsulation device as they receive
      ICMP Fragmentation Needed and ICMPv6 Packet Too Big messages, and
      we already rebuild those messages with the appropriate MTU and route
      them back to the sender.
      
      Add the missing bits for bridged cases:
      
      - checks in skb_tunnel_check_pmtu() to understand if it's appropriate
        to trigger a reply according to RFC 1122 section 3.2.2 for ICMP and
        RFC 4443 section 2.4 for ICMPv6. This function is already called by
        UDP tunnels
      
      - a new function generating those ICMP or ICMPv6 replies. We can't
        reuse icmp_send() and icmp6_send() as we don't see the sender as a
        valid destination. This doesn't need to be generic, as we don't
        cover any other type of ICMP errors given that we only provide an
        encapsulation function to the sender
      
      While at it, make the MTU check in skb_tunnel_check_pmtu() accurate:
      we might receive GSO buffers here, and the passed headroom already
      includes the inner MAC length, so we don't have to account for it
      a second time (that would imply three MAC headers on the wire, but
      there are just two).
      
      This issue became visible while bridging IPv6 packets with 4500 bytes
      of payload over GENEVE using IPv4 with a PMTU of 4000. Given the 50
      bytes of encapsulation headroom, we would advertise MTU as 3950, and
      we would reject fragmented IPv6 datagrams of 3958 bytes size on the
      wire. We're exclusively dealing with network MTU here, though, so we
      could get Ethernet frames up to 3964 octets in that case.
      
      v2:
      - moved skb_tunnel_check_pmtu() to ip_tunnel_core.c (David Ahern)
      - split IPv4/IPv6 functions (David Ahern)
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4cb47a86
    • J
      via-velocity: Use more typical logging styles · 93f4ddd6
      Joe Perches 提交于
      Use netdev_<level> in place of VELOCITY_PRT.
      Use pr_<level> in place of printk(KERN_<LEVEL>.
      
      Miscellanea:
      
      o Add pr_fmt to prefix pr_<level> output with "via-velocity: "
      o Remove now unused functions and macros
      o Realign some logging lines
      o Remove devname where pr_<level> is also used
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93f4ddd6
    • L
      hinic: add check for mailbox msg from VF · c8c29ec3
      Luo bin 提交于
      PF should check whether the cmd from VF is supported and its content
      is right before passing it to hw.
      Signed-off-by: NLuo bin <luobin9@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c8c29ec3
    • L
      hinic: add generating mailbox random index support · 088c5f0d
      Luo bin 提交于
      add support to generate mailbox random id of VF to ensure that
      mailbox messages PF received are from the correct VF.
      Signed-off-by: NLuo bin <luobin9@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      088c5f0d
  4. 04 8月, 2020 20 次提交