1. 24 9月, 2021 9 次提交
  2. 23 9月, 2021 10 次提交
    • S
      atlantic: Fix issue in the pm resume flow. · 4d88c339
      Sudarsana Reddy Kalluru 提交于
      After fixing hibernation resume flow, another usecase was found which
      should be explicitly handled - resume when device is in "down" state.
      Invoke aq_nic_init jointly with aq_nic_start only if ndev was already
      up during suspend/hibernate. We still need to perform nic_deinit() if
      caller requests for it, to handle the freeze/resume scenarios.
      
      Fixes: 57f780f1 ("atlantic: Fix driver resume flow.")
      Signed-off-by: NSudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: NIgor Russkikh <irusskikh@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d88c339
    • A
      net/mlx4_en: Don't allow aRFS for encapsulated packets · fdbccea4
      Aya Levin 提交于
      Driver doesn't support aRFS for encapsulated packets, return early error
      in such a case.
      
      Fixes: 1eb8c695 ("net/mlx4_en: Add accelerated RFS support")
      Signed-off-by: NAya Levin <ayal@nvidia.com>
      Signed-off-by: NTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fdbccea4
    • V
      net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled · acc64f52
      Vladimir Oltean 提交于
      The blamed commit made the fatally incorrect assumption that ports which
      aren't in the FORWARDING STP state should not have packets forwarded
      towards them, and that is all that needs to be done.
      
      However, that logic alone permits BLOCKING ports to forward to
      FORWARDING ports, which of course allows packet storms to occur when
      there is an L2 loop.
      
      The ocelot_get_bridge_fwd_mask should not only ask "what can the bridge
      do for you", but "what can you do for the bridge". This way, only
      FORWARDING ports forward to the other FORWARDING ports from the same
      bridging domain, and we are still compatible with the idea of multiple
      bridges.
      
      Fixes: df291e54 ("net: ocelot: support multiple bridges")
      Suggested-by: NColin Foster <colin.foster@in-advantage.com>
      Reported-by: NColin Foster <colin.foster@in-advantage.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NColin Foster <colin.foster@in-advantage.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      acc64f52
    • F
      net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries · e68daf61
      Felix Fietkau 提交于
      Sometimes multiple CLS_REPLACE calls are issued for the same connection.
      rhashtable_insert_fast does not check for these duplicates, so multiple
      hardware flow entries can be created.
      Fix this by checking for an existing entry early
      
      Fixes: 502e84e2 ("net: ethernet: mtk_eth_soc: add flow offloading support")
      Signed-off-by: NFelix Fietkau <nbd@nbd.name>
      Signed-off-by: NIlya Lipnitskiy <ilya.lipnitskiy@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e68daf61
    • V
      net: dsa: sja1105: stop using priv->vlan_aware · 9aad3e4e
      Vladimir Oltean 提交于
      Now that the sja1105 driver is finally sane enough again to stop having
      a ternary VLAN awareness state, we can remove priv->vlan_aware and query
      DSA for the ds->vlan_filtering value (for SJA1105, VLAN filtering is a
      global property).
      
      Also drop the paranoid checking that DSA calls ->port_vlan_filtering
      multiple times without the VLAN awareness state changing. It doesn't,
      the same check is present inside dsa_port_vlan_filtering too.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9aad3e4e
    • V
      net: dsa: sja1105: don't keep a persistent reference to the reset GPIO · 33e1501f
      Vladimir Oltean 提交于
      The driver only needs the reset GPIO for a very brief period, so instead
      of using devres and keeping the descriptor pointer inside priv, just use
      that descriptor inside the sja1105_hw_reset function and then let go of
      it.
      
      Also use gpiod_get_optional while at it, and error out on real errors
      (bad flags etc).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      33e1501f
    • V
      net: dsa: sja1105: break dependency between dsa_port_is_sja1105 and switch driver · f5aef424
      Vladimir Oltean 提交于
      It's nice to be able to test a tagging protocol with dsa_loop, but not
      at the cost of losing the ability of building the tagging protocol and
      switch driver as modules, because as things stand, there is a circular
      dependency between the two. Tagging protocol drivers cannot depend on
      switch drivers, that is a hard fact.
      
      The reasoning behind the blamed patch was that accessing dp->priv should
      first make sure that the structure behind that pointer is what we really
      think it is.
      
      Currently the "sja1105" and "sja1110" tagging protocols only operate
      with the sja1105 switch driver, just like any other tagging protocol and
      switch combination. The only way to mix and match them is by modifying
      the code, and this applies to dsa_loop as well (by default that uses
      DSA_TAG_PROTO_NONE). So while in principle there is an issue, in
      practice there isn't one.
      
      Until we extend dsa_loop to allow user space configuration, treat the
      problem as a non-issue and just say that DSA ports found by tag_sja1105
      are always sja1105 ports, which is in fact true. But keep the
      dsa_port_is_sja1105 function so that it's easy to patch it during
      testing, and rely on dead code elimination.
      
      Fixes: 994d2cbb ("net: dsa: tag_sja1105: be dsa_loop-safe")
      Link: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f5aef424
    • V
      net: dsa: move sja1110_process_meta_tstamp inside the tagging protocol driver · 6d709cad
      Vladimir Oltean 提交于
      The problem is that DSA tagging protocols really must not depend on the
      switch driver, because this creates a circular dependency at insmod
      time, and the switch driver will effectively not load when the tagging
      protocol driver is missing.
      
      The code was structured in the way it was for a reason, though. The DSA
      driver-facing API for PTP timestamping relies on the assumption that
      two-step TX timestamps are provided by the hardware in an out-of-band
      manner, typically by raising an interrupt and making that timestamp
      available inside some sort of FIFO which is to be accessed over
      SPI/MDIO/etc.
      
      So the API puts .port_txtstamp into dsa_switch_ops, because it is
      expected that the switch driver needs to save some state (like put the
      skb into a queue until its TX timestamp arrives).
      
      On SJA1110, TX timestamps are provided by the switch as Ethernet
      packets, so this makes them be received and processed by the tagging
      protocol driver. This in itself is great, because the timestamps are
      full 64-bit and do not require reconstruction, and since Ethernet is the
      fastest I/O method available to/from the switch, PTP timestamps arrive
      very quickly, no matter how bottlenecked the SPI connection is, because
      SPI interaction is not needed at all.
      
      DSA's code structure and strict isolation between the tagging protocol
      driver and the switch driver break the natural code organization.
      
      When the tagging protocol driver receives a packet which is classified
      as a metadata packet containing timestamps, it passes those timestamps
      one by one to the switch driver, which then proceeds to compare them
      based on the recorded timestamp ID that was generated in .port_txtstamp.
      
      The communication between the tagging protocol and the switch driver is
      done through a method exported by the switch driver, sja1110_process_meta_tstamp.
      To satisfy build requirements, we force a dependency to build the
      tagging protocol driver as a module when the switch driver is a module.
      However, as explained in the first paragraph, that causes the circular
      dependency.
      
      To solve this, move the skb queue from struct sja1105_private :: struct
      sja1105_ptp_data to struct sja1105_private :: struct sja1105_tagger_data.
      The latter is a data structure for which hacks have already been put
      into place to be able to create persistent storage per switch that is
      accessible from the tagging protocol driver (see sja1105_setup_ports).
      
      With the skb queue directly accessible from the tagging protocol driver,
      we can now move sja1110_process_meta_tstamp into the tagging driver
      itself, and avoid exporting a symbol.
      
      Fixes: 566b18c8 ("net: dsa: sja1105: implement TX timestamping for SJA1110")
      Link: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d709cad
    • V
      net: dsa: sja1105: remove sp->dp · 68a81bb2
      Vladimir Oltean 提交于
      It looks like this field was never used since its introduction in commit
      227d07a0 ("net: dsa: sja1105: Add support for traffic through
      standalone ports") remove it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68a81bb2
    • I
      nexthop: Fix memory leaks in nexthop notification chain listeners · 3106a084
      Ido Schimmel 提交于
      syzkaller discovered memory leaks [1] that can be reduced to the
      following commands:
      
       # ip nexthop add id 1 blackhole
       # devlink dev reload pci/0000:06:00.0
      
      As part of the reload flow, mlxsw will unregister its netdevs and then
      unregister from the nexthop notification chain. Before unregistering
      from the notification chain, mlxsw will receive delete notifications for
      nexthop objects using netdevs registered by mlxsw or their uppers. mlxsw
      will not receive notifications for nexthops using netdevs that are not
      dismantled as part of the reload flow. For example, the blackhole
      nexthop above that internally uses the loopback netdev as its nexthop
      device.
      
      One way to fix this problem is to have listeners flush their nexthop
      tables after unregistering from the notification chain. This is
      error-prone as evident by this patch and also not symmetric with the
      registration path where a listener receives a dump of all the existing
      nexthops.
      
      Therefore, fix this problem by replaying delete notifications for the
      listener being unregistered. This is symmetric to the registration path
      and also consistent with the netdev notification chain.
      
      The above means that unregister_nexthop_notifier(), like
      register_nexthop_notifier(), will have to take RTNL in order to iterate
      over the existing nexthops and that any callers of the function cannot
      hold RTNL. This is true for mlxsw and netdevsim, but not for the VXLAN
      driver. To avoid a deadlock, change the latter to unregister its nexthop
      listener without holding RTNL, making it symmetric to the registration
      path.
      
      [1]
      unreferenced object 0xffff88806173d600 (size 512):
        comm "syz-executor.0", pid 1290, jiffies 4295583142 (age 143.507s)
        hex dump (first 32 bytes):
          41 9d 1e 60 80 88 ff ff 08 d6 73 61 80 88 ff ff  A..`......sa....
          08 d6 73 61 80 88 ff ff 01 00 00 00 00 00 00 00  ..sa............
        backtrace:
          [<ffffffff81a6b576>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
          [<ffffffff81a6b576>] slab_post_alloc_hook+0x96/0x490 mm/slab.h:522
          [<ffffffff81a716d3>] slab_alloc_node mm/slub.c:3206 [inline]
          [<ffffffff81a716d3>] slab_alloc mm/slub.c:3214 [inline]
          [<ffffffff81a716d3>] kmem_cache_alloc_trace+0x163/0x370 mm/slub.c:3231
          [<ffffffff82e8681a>] kmalloc include/linux/slab.h:591 [inline]
          [<ffffffff82e8681a>] kzalloc include/linux/slab.h:721 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_group_create drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4918 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_new drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5054 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_event+0x59a/0x2910 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5239
          [<ffffffff813ef67d>] notifier_call_chain+0xbd/0x210 kernel/notifier.c:83
          [<ffffffff813f0662>] blocking_notifier_call_chain kernel/notifier.c:318 [inline]
          [<ffffffff813f0662>] blocking_notifier_call_chain+0x72/0xa0 kernel/notifier.c:306
          [<ffffffff8384b9c6>] call_nexthop_notifiers+0x156/0x310 net/ipv4/nexthop.c:244
          [<ffffffff83852bd8>] insert_nexthop net/ipv4/nexthop.c:2336 [inline]
          [<ffffffff83852bd8>] nexthop_add net/ipv4/nexthop.c:2644 [inline]
          [<ffffffff83852bd8>] rtm_new_nexthop+0x14e8/0x4d10 net/ipv4/nexthop.c:2913
          [<ffffffff833e9a78>] rtnetlink_rcv_msg+0x448/0xbf0 net/core/rtnetlink.c:5572
          [<ffffffff83608703>] netlink_rcv_skb+0x173/0x480 net/netlink/af_netlink.c:2504
          [<ffffffff833de032>] rtnetlink_rcv+0x22/0x30 net/core/rtnetlink.c:5590
          [<ffffffff836069de>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
          [<ffffffff836069de>] netlink_unicast+0x5ae/0x7f0 net/netlink/af_netlink.c:1340
          [<ffffffff83607501>] netlink_sendmsg+0x8e1/0xe30 net/netlink/af_netlink.c:1929
          [<ffffffff832fde84>] sock_sendmsg_nosec net/socket.c:704 [inline]
          [<ffffffff832fde84>] sock_sendmsg net/socket.c:724 [inline]
          [<ffffffff832fde84>] ____sys_sendmsg+0x874/0x9f0 net/socket.c:2409
          [<ffffffff83304a44>] ___sys_sendmsg+0x104/0x170 net/socket.c:2463
          [<ffffffff83304c01>] __sys_sendmsg+0x111/0x1f0 net/socket.c:2492
          [<ffffffff83304d5d>] __do_sys_sendmsg net/socket.c:2501 [inline]
          [<ffffffff83304d5d>] __se_sys_sendmsg net/socket.c:2499 [inline]
          [<ffffffff83304d5d>] __x64_sys_sendmsg+0x7d/0xc0 net/socket.c:2499
      
      Fixes: 2a014b20 ("mlxsw: spectrum_router: Add support for nexthop objects")
      Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3106a084
  3. 22 9月, 2021 5 次提交
  4. 21 9月, 2021 6 次提交
    • V
      net: dsa: realtek: register the MDIO bus under devres · 74b6d7d1
      Vladimir Oltean 提交于
      The Linux device model permits both the ->shutdown and ->remove driver
      methods to get called during a shutdown procedure. Example: a DSA switch
      which sits on an SPI bus, and the SPI bus driver calls this on its
      ->shutdown method:
      
      spi_unregister_controller
      -> device_for_each_child(&ctlr->dev, NULL, __unregister);
         -> spi_unregister_device(to_spi_device(dev));
            -> device_del(&spi->dev);
      
      So this is a simple pattern which can theoretically appear on any bus,
      although the only other buses on which I've been able to find it are
      I2C:
      
      i2c_del_adapter
      -> device_for_each_child(&adap->dev, NULL, __unregister_client);
         -> i2c_unregister_device(client);
            -> device_unregister(&client->dev);
      
      The implication of this pattern is that devices on these buses can be
      unregistered after having been shut down. The drivers for these devices
      might choose to return early either from ->remove or ->shutdown if the
      other callback has already run once, and they might choose that the
      ->shutdown method should only perform a subset of the teardown done by
      ->remove (to avoid unnecessary delays when rebooting).
      
      So in other words, the device driver may choose on ->remove to not
      do anything (therefore to not unregister an MDIO bus it has registered
      on ->probe), because this ->remove is actually triggered by the
      device_shutdown path, and its ->shutdown method has already run and done
      the minimally required cleanup.
      
      This used to be fine until the blamed commit, but now, the following
      BUG_ON triggers:
      
      void mdiobus_free(struct mii_bus *bus)
      {
      	/* For compatibility with error handling in drivers. */
      	if (bus->state == MDIOBUS_ALLOCATED) {
      		kfree(bus);
      		return;
      	}
      
      	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
      	bus->state = MDIOBUS_RELEASED;
      
      	put_device(&bus->dev);
      }
      
      In other words, there is an attempt to free an MDIO bus which was not
      unregistered. The attempt to free it comes from the devres release
      callbacks of the SPI device, which are executed after the device is
      unregistered.
      
      I'm not saying that the fact that MDIO buses allocated using devres
      would automatically get unregistered wasn't strange. I'm just saying
      that the commit didn't care about auditing existing call paths in the
      kernel, and now, the following code sequences are potentially buggy:
      
      (a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device
          located on a bus that unregisters its children on shutdown. After
          the blamed patch, either both the alloc and the register should use
          devres, or none should.
      
      (b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no
          mdiobus_unregister at all in the remove path. After the blamed
          patch, nobody unregisters the MDIO bus anymore, so this is even more
          buggy than the previous case which needs a specific bus
          configuration to be seen, this one is an unconditional bug.
      
      In this case, the Realtek drivers fall under category (b). To solve it,
      we can register the MDIO bus under devres too, which restores the
      previous behavior.
      
      Fixes: ac3a68d5 ("net: phy: don't abuse devres in devm_mdiobus_register()")
      Reported-by: NLino Sanfilippo <LinoSanfilippo@gmx.de>
      Reported-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      74b6d7d1
    • F
      net: dsa: bcm_sf2: Request APD, DLL disable and IDDQ-SR · 4972ce72
      Florian Fainelli 提交于
      When interfacing with a Broadcom PHY, request the auto-power down, DLL
      disable and IDDQ-SR modes to be enabled.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4972ce72
    • F
      net: bcmgenet: Request APD, DLL disable and IDDQ-SR · c3a4c693
      Florian Fainelli 提交于
      When interfacing with a Broadcom PHY, request the auto-power down, DLL
      disable and IDDQ-SR modes to be enabled.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c3a4c693
    • F
      net: phy: broadcom: Utilize appropriate suspend for BCM54810/11 · 72e78d22
      Florian Fainelli 提交于
      Since we enable APD and DLL/RXC/TXC disable we need to use
      bcm54xx_suspend() in order not to do a read/modify/write of the BMCR
      register which is incompatible with the desired settings.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72e78d22
    • F
      net: phy: broadcom: Wire suspend/resume for BCM50610 and BCM50610M · 38b6a907
      Florian Fainelli 提交于
      These two Ethernet PHYs support IDDQ-SR therefore wire-up the suspend
      and resume callbacks to point to bcm54xx_suspend() and bcm54xx_resume().
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      38b6a907
    • F
      net: phy: broadcom: Add IDDQ-SR mode · d6da08ed
      Florian Fainelli 提交于
      Add support for putting the PHY into IDDQ Soft Recovery mode by setting
      the TOP_MISC register bits accordingly. This requires us to implement a
      custom bcm54xx_suspend() routine which diverges from genphy_suspend() in
      order to configure the PHY to enter IDDQ with software recovery as well
      as avoid doing a read/modify/write on the BMCR register.
      
      Doing a read/modify/write on the BMCR register means that the
      auto-negotation bit may remain which interferes with the ability to put
      the PHY into IDDQ-SR mode. We do software reset upon suspend in order to
      put the PHY back into its state prior to suspend as recommended by the
      datasheet.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6da08ed
  5. 20 9月, 2021 10 次提交