1. 24 9月, 2021 9 次提交
  2. 23 9月, 2021 11 次提交
    • S
      atlantic: Fix issue in the pm resume flow. · 4d88c339
      Sudarsana Reddy Kalluru 提交于
      After fixing hibernation resume flow, another usecase was found which
      should be explicitly handled - resume when device is in "down" state.
      Invoke aq_nic_init jointly with aq_nic_start only if ndev was already
      up during suspend/hibernate. We still need to perform nic_deinit() if
      caller requests for it, to handle the freeze/resume scenarios.
      
      Fixes: 57f780f1 ("atlantic: Fix driver resume flow.")
      Signed-off-by: NSudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: NIgor Russkikh <irusskikh@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d88c339
    • A
      net/mlx4_en: Don't allow aRFS for encapsulated packets · fdbccea4
      Aya Levin 提交于
      Driver doesn't support aRFS for encapsulated packets, return early error
      in such a case.
      
      Fixes: 1eb8c695 ("net/mlx4_en: Add accelerated RFS support")
      Signed-off-by: NAya Levin <ayal@nvidia.com>
      Signed-off-by: NTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fdbccea4
    • V
      net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled · acc64f52
      Vladimir Oltean 提交于
      The blamed commit made the fatally incorrect assumption that ports which
      aren't in the FORWARDING STP state should not have packets forwarded
      towards them, and that is all that needs to be done.
      
      However, that logic alone permits BLOCKING ports to forward to
      FORWARDING ports, which of course allows packet storms to occur when
      there is an L2 loop.
      
      The ocelot_get_bridge_fwd_mask should not only ask "what can the bridge
      do for you", but "what can you do for the bridge". This way, only
      FORWARDING ports forward to the other FORWARDING ports from the same
      bridging domain, and we are still compatible with the idea of multiple
      bridges.
      
      Fixes: df291e54 ("net: ocelot: support multiple bridges")
      Suggested-by: NColin Foster <colin.foster@in-advantage.com>
      Reported-by: NColin Foster <colin.foster@in-advantage.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NColin Foster <colin.foster@in-advantage.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      acc64f52
    • F
      net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries · e68daf61
      Felix Fietkau 提交于
      Sometimes multiple CLS_REPLACE calls are issued for the same connection.
      rhashtable_insert_fast does not check for these duplicates, so multiple
      hardware flow entries can be created.
      Fix this by checking for an existing entry early
      
      Fixes: 502e84e2 ("net: ethernet: mtk_eth_soc: add flow offloading support")
      Signed-off-by: NFelix Fietkau <nbd@nbd.name>
      Signed-off-by: NIlya Lipnitskiy <ilya.lipnitskiy@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e68daf61
    • V
      net: dsa: sja1105: stop using priv->vlan_aware · 9aad3e4e
      Vladimir Oltean 提交于
      Now that the sja1105 driver is finally sane enough again to stop having
      a ternary VLAN awareness state, we can remove priv->vlan_aware and query
      DSA for the ds->vlan_filtering value (for SJA1105, VLAN filtering is a
      global property).
      
      Also drop the paranoid checking that DSA calls ->port_vlan_filtering
      multiple times without the VLAN awareness state changing. It doesn't,
      the same check is present inside dsa_port_vlan_filtering too.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9aad3e4e
    • M
      nfc: st-nci: Add SPI ID matching DT compatible · 31339440
      Mark Brown 提交于
      Currently autoloading for SPI devices does not use the DT ID table, it uses
      SPI modalises. Supporting OF modalises is going to be difficult if not
      impractical, an attempt was made but has been reverted, so ensure that
      module autoloading works for this driver by adding the part name used in
      the compatible to the list of SPI IDs.
      
      Fixes: 96c8395e ("spi: Revert modalias changes")
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      31339440
    • V
      net: dsa: sja1105: don't keep a persistent reference to the reset GPIO · 33e1501f
      Vladimir Oltean 提交于
      The driver only needs the reset GPIO for a very brief period, so instead
      of using devres and keeping the descriptor pointer inside priv, just use
      that descriptor inside the sja1105_hw_reset function and then let go of
      it.
      
      Also use gpiod_get_optional while at it, and error out on real errors
      (bad flags etc).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      33e1501f
    • V
      net: dsa: sja1105: break dependency between dsa_port_is_sja1105 and switch driver · f5aef424
      Vladimir Oltean 提交于
      It's nice to be able to test a tagging protocol with dsa_loop, but not
      at the cost of losing the ability of building the tagging protocol and
      switch driver as modules, because as things stand, there is a circular
      dependency between the two. Tagging protocol drivers cannot depend on
      switch drivers, that is a hard fact.
      
      The reasoning behind the blamed patch was that accessing dp->priv should
      first make sure that the structure behind that pointer is what we really
      think it is.
      
      Currently the "sja1105" and "sja1110" tagging protocols only operate
      with the sja1105 switch driver, just like any other tagging protocol and
      switch combination. The only way to mix and match them is by modifying
      the code, and this applies to dsa_loop as well (by default that uses
      DSA_TAG_PROTO_NONE). So while in principle there is an issue, in
      practice there isn't one.
      
      Until we extend dsa_loop to allow user space configuration, treat the
      problem as a non-issue and just say that DSA ports found by tag_sja1105
      are always sja1105 ports, which is in fact true. But keep the
      dsa_port_is_sja1105 function so that it's easy to patch it during
      testing, and rely on dead code elimination.
      
      Fixes: 994d2cbb ("net: dsa: tag_sja1105: be dsa_loop-safe")
      Link: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f5aef424
    • V
      net: dsa: move sja1110_process_meta_tstamp inside the tagging protocol driver · 6d709cad
      Vladimir Oltean 提交于
      The problem is that DSA tagging protocols really must not depend on the
      switch driver, because this creates a circular dependency at insmod
      time, and the switch driver will effectively not load when the tagging
      protocol driver is missing.
      
      The code was structured in the way it was for a reason, though. The DSA
      driver-facing API for PTP timestamping relies on the assumption that
      two-step TX timestamps are provided by the hardware in an out-of-band
      manner, typically by raising an interrupt and making that timestamp
      available inside some sort of FIFO which is to be accessed over
      SPI/MDIO/etc.
      
      So the API puts .port_txtstamp into dsa_switch_ops, because it is
      expected that the switch driver needs to save some state (like put the
      skb into a queue until its TX timestamp arrives).
      
      On SJA1110, TX timestamps are provided by the switch as Ethernet
      packets, so this makes them be received and processed by the tagging
      protocol driver. This in itself is great, because the timestamps are
      full 64-bit and do not require reconstruction, and since Ethernet is the
      fastest I/O method available to/from the switch, PTP timestamps arrive
      very quickly, no matter how bottlenecked the SPI connection is, because
      SPI interaction is not needed at all.
      
      DSA's code structure and strict isolation between the tagging protocol
      driver and the switch driver break the natural code organization.
      
      When the tagging protocol driver receives a packet which is classified
      as a metadata packet containing timestamps, it passes those timestamps
      one by one to the switch driver, which then proceeds to compare them
      based on the recorded timestamp ID that was generated in .port_txtstamp.
      
      The communication between the tagging protocol and the switch driver is
      done through a method exported by the switch driver, sja1110_process_meta_tstamp.
      To satisfy build requirements, we force a dependency to build the
      tagging protocol driver as a module when the switch driver is a module.
      However, as explained in the first paragraph, that causes the circular
      dependency.
      
      To solve this, move the skb queue from struct sja1105_private :: struct
      sja1105_ptp_data to struct sja1105_private :: struct sja1105_tagger_data.
      The latter is a data structure for which hacks have already been put
      into place to be able to create persistent storage per switch that is
      accessible from the tagging protocol driver (see sja1105_setup_ports).
      
      With the skb queue directly accessible from the tagging protocol driver,
      we can now move sja1110_process_meta_tstamp into the tagging driver
      itself, and avoid exporting a symbol.
      
      Fixes: 566b18c8 ("net: dsa: sja1105: implement TX timestamping for SJA1110")
      Link: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d709cad
    • V
      net: dsa: sja1105: remove sp->dp · 68a81bb2
      Vladimir Oltean 提交于
      It looks like this field was never used since its introduction in commit
      227d07a0 ("net: dsa: sja1105: Add support for traffic through
      standalone ports") remove it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68a81bb2
    • I
      nexthop: Fix memory leaks in nexthop notification chain listeners · 3106a084
      Ido Schimmel 提交于
      syzkaller discovered memory leaks [1] that can be reduced to the
      following commands:
      
       # ip nexthop add id 1 blackhole
       # devlink dev reload pci/0000:06:00.0
      
      As part of the reload flow, mlxsw will unregister its netdevs and then
      unregister from the nexthop notification chain. Before unregistering
      from the notification chain, mlxsw will receive delete notifications for
      nexthop objects using netdevs registered by mlxsw or their uppers. mlxsw
      will not receive notifications for nexthops using netdevs that are not
      dismantled as part of the reload flow. For example, the blackhole
      nexthop above that internally uses the loopback netdev as its nexthop
      device.
      
      One way to fix this problem is to have listeners flush their nexthop
      tables after unregistering from the notification chain. This is
      error-prone as evident by this patch and also not symmetric with the
      registration path where a listener receives a dump of all the existing
      nexthops.
      
      Therefore, fix this problem by replaying delete notifications for the
      listener being unregistered. This is symmetric to the registration path
      and also consistent with the netdev notification chain.
      
      The above means that unregister_nexthop_notifier(), like
      register_nexthop_notifier(), will have to take RTNL in order to iterate
      over the existing nexthops and that any callers of the function cannot
      hold RTNL. This is true for mlxsw and netdevsim, but not for the VXLAN
      driver. To avoid a deadlock, change the latter to unregister its nexthop
      listener without holding RTNL, making it symmetric to the registration
      path.
      
      [1]
      unreferenced object 0xffff88806173d600 (size 512):
        comm "syz-executor.0", pid 1290, jiffies 4295583142 (age 143.507s)
        hex dump (first 32 bytes):
          41 9d 1e 60 80 88 ff ff 08 d6 73 61 80 88 ff ff  A..`......sa....
          08 d6 73 61 80 88 ff ff 01 00 00 00 00 00 00 00  ..sa............
        backtrace:
          [<ffffffff81a6b576>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
          [<ffffffff81a6b576>] slab_post_alloc_hook+0x96/0x490 mm/slab.h:522
          [<ffffffff81a716d3>] slab_alloc_node mm/slub.c:3206 [inline]
          [<ffffffff81a716d3>] slab_alloc mm/slub.c:3214 [inline]
          [<ffffffff81a716d3>] kmem_cache_alloc_trace+0x163/0x370 mm/slub.c:3231
          [<ffffffff82e8681a>] kmalloc include/linux/slab.h:591 [inline]
          [<ffffffff82e8681a>] kzalloc include/linux/slab.h:721 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_group_create drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4918 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_new drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5054 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_event+0x59a/0x2910 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5239
          [<ffffffff813ef67d>] notifier_call_chain+0xbd/0x210 kernel/notifier.c:83
          [<ffffffff813f0662>] blocking_notifier_call_chain kernel/notifier.c:318 [inline]
          [<ffffffff813f0662>] blocking_notifier_call_chain+0x72/0xa0 kernel/notifier.c:306
          [<ffffffff8384b9c6>] call_nexthop_notifiers+0x156/0x310 net/ipv4/nexthop.c:244
          [<ffffffff83852bd8>] insert_nexthop net/ipv4/nexthop.c:2336 [inline]
          [<ffffffff83852bd8>] nexthop_add net/ipv4/nexthop.c:2644 [inline]
          [<ffffffff83852bd8>] rtm_new_nexthop+0x14e8/0x4d10 net/ipv4/nexthop.c:2913
          [<ffffffff833e9a78>] rtnetlink_rcv_msg+0x448/0xbf0 net/core/rtnetlink.c:5572
          [<ffffffff83608703>] netlink_rcv_skb+0x173/0x480 net/netlink/af_netlink.c:2504
          [<ffffffff833de032>] rtnetlink_rcv+0x22/0x30 net/core/rtnetlink.c:5590
          [<ffffffff836069de>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
          [<ffffffff836069de>] netlink_unicast+0x5ae/0x7f0 net/netlink/af_netlink.c:1340
          [<ffffffff83607501>] netlink_sendmsg+0x8e1/0xe30 net/netlink/af_netlink.c:1929
          [<ffffffff832fde84>] sock_sendmsg_nosec net/socket.c:704 [inline]
          [<ffffffff832fde84>] sock_sendmsg net/socket.c:724 [inline]
          [<ffffffff832fde84>] ____sys_sendmsg+0x874/0x9f0 net/socket.c:2409
          [<ffffffff83304a44>] ___sys_sendmsg+0x104/0x170 net/socket.c:2463
          [<ffffffff83304c01>] __sys_sendmsg+0x111/0x1f0 net/socket.c:2492
          [<ffffffff83304d5d>] __do_sys_sendmsg net/socket.c:2501 [inline]
          [<ffffffff83304d5d>] __se_sys_sendmsg net/socket.c:2499 [inline]
          [<ffffffff83304d5d>] __x64_sys_sendmsg+0x7d/0xc0 net/socket.c:2499
      
      Fixes: 2a014b20 ("mlxsw: spectrum_router: Add support for nexthop objects")
      Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3106a084
  3. 22 9月, 2021 9 次提交
    • S
      qed: rdma - don't wait for resources under hw error recovery flow · 1ea78123
      Shai Malin 提交于
      If the HW device is during recovery, the HW resources will never return,
      hence we shouldn't wait for the CID (HW context ID) bitmaps to clear.
      This fix speeds up the error recovery flow.
      
      Fixes: 64515dc8 ("qed: Add infrastructure for error detection and recovery")
      Signed-off-by: NMichal Kalderon <mkalderon@marvell.com>
      Signed-off-by: NAriel Elior <aelior@marvell.com>
      Signed-off-by: NShai Malin <smalin@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1ea78123
    • I
      mlxsw: spectrum_router: Start using new trap adjacency entry · e3a3aae7
      Ido Schimmel 提交于
      Start using the trap adjacency entry that was added in the previous
      patch and remove the existing one which is no longer needed.
      
      Note that the name of the old entry was inaccurate as the entry did not
      discard packets, but trapped them.
      Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: NJiri Pirko <jiri@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e3a3aae7
    • I
      mlxsw: spectrum_router: Add trap adjacency entry upon first nexthop group · 4bdf80bc
      Ido Schimmel 提交于
      In commit 0c3cbbf9 ("mlxsw: Add specific trap for packets routed via
      invalid nexthops"), mlxsw started allocating a new adjacency entry
      during driver initialization, to trap packets routed via invalid
      nexthops.
      
      This behavior was later altered in commit 983db619 ("mlxsw:
      spectrum_router: Allocate discard adjacency entry when needed") to only
      allocate the entry upon the first route that requires it. The motivation
      for the change is explained in the commit message.
      
      The problem with the current behavior is that the entry shows up as a
      "leak" in a new BPF resource monitoring tool [1]. This is caused by the
      asymmetry of the allocation/free scheme. While the entry is allocated
      upon the first route that requires it, it is only freed during
      de-initialization of the driver.
      
      Instead, track the number of active nexthop groups and allocate the
      adjacency entry upon the creation of the first group. Free it when the
      number of active groups reaches zero.
      
      The next patch will convert mlxsw to start using the new entry and
      remove the old one.
      
      [1] https://github.com/Mellanox/mlxsw/tree/master/Debugging/libbpf-tools/resmonSigned-off-by: NIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: NJiri Pirko <jiri@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4bdf80bc
    • M
      net: wwan: iosm: fw flashing and cd improvements · 8bea96ef
      M Chetan Kumar 提交于
      1> Function comments moved to .c file.
      2> Use literals in return to improve readability.
      3> Do error handling check instead of success check.
      4> Redundant ret assignment removed.
      Signed-off-by: NM Chetan Kumar <m.chetan.kumar@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8bea96ef
    • L
      devlink: Make devlink_register to be void · db4278c5
      Leon Romanovsky 提交于
      devlink_register() can't fail and always returns success, but all drivers
      are obligated to check returned status anyway. This adds a lot of boilerplate
      code to handle impossible flow.
      
      Make devlink_register() void and simplify the drivers that use that
      API call.
      Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
      Acked-by: NSimon Horman <simon.horman@corigine.com>
      Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa
      Reviewed-by: NJiri Pirko <jiri@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      db4278c5
    • A
      s390/qeth: fix deadlock during failing recovery · d2b59bd4
      Alexandra Winter 提交于
      Commit 0b9902c1 ("s390/qeth: fix deadlock during recovery") removed
      taking discipline_mutex inside qeth_do_reset(), fixing potential
      deadlocks. An error path was missed though, that still takes
      discipline_mutex and thus has the original deadlock potential.
      
      Intermittent deadlocks were seen when a qeth channel path is configured
      offline, causing a race between qeth_do_reset and ccwgroup_remove.
      Call qeth_set_offline() directly in the qeth_do_reset() error case and
      then a new variant of ccwgroup_set_offline(), without taking
      discipline_mutex.
      
      Fixes: b41b554c ("s390/qeth: fix locking for discipline setup / removal")
      Signed-off-by: NAlexandra Winter <wintera@linux.ibm.com>
      Reviewed-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      d2b59bd4
    • A
      s390/qeth: Fix deadlock in remove_discipline · ee909d0b
      Alexandra Winter 提交于
      Problem: qeth_close_dev_handler is a worker that tries to acquire
      card->discipline_mutex via drv->set_offline() in ccwgroup_set_offline().
      Since commit b41b554c
      ("s390/qeth: fix locking for discipline setup / removal")
      qeth_remove_discipline() is called under card->discipline_mutex and
      cancels the work and waits for it to finish.
      
      STOPLAN reception with reason code IPA_RC_VEPA_TO_VEB_TRANSITION is the
      only situation that schedules close_dev_work. In that situation scheduling
      qeth recovery will also result in an offline interface, when resetting the
      isolation mode fails, if the external switch is still set to VEB.
      And since commit 0b9902c1 ("s390/qeth: fix deadlock during recovery")
      qeth recovery does not aquire card->discipline_mutex anymore.
      
      So we accept the longer pathlength of qeth_schedule_recovery in this
      error situation and re-use the existing function.
      
      As a side-benefit this changes the hwtrap to behave like during recovery
      instead of like during a user-triggered set_offline.
      
      Fixes: b41b554c ("s390/qeth: fix locking for discipline setup / removal")
      Signed-off-by: NAlexandra Winter <wintera@linux.ibm.com>
      Acked-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      ee909d0b
    • J
      s390/qeth: fix NULL deref in qeth_clear_working_pool_list() · 248f064a
      Julian Wiedmann 提交于
      When qeth_set_online() calls qeth_clear_working_pool_list() to roll
      back after an error exit from qeth_hardsetup_card(), we are at risk of
      accessing card->qdio.in_q before it was allocated by
      qeth_alloc_qdio_queues() via qeth_mpc_initialize().
      
      qeth_clear_working_pool_list() then dereferences NULL, and by writing to
      queue->bufs[i].pool_entry scribbles all over the CPU's lowcore.
      Resulting in a crash when those lowcore areas are used next (eg. on
      the next machine-check interrupt).
      
      Such a scenario would typically happen when the device is first set
      online and its queues aren't allocated yet. An early IO error or certain
      misconfigs (eg. mismatched transport mode, bad portno) then cause us to
      error out from qeth_hardsetup_card() with card->qdio.in_q still being
      NULL.
      
      Fix it by checking the pointer for NULL before accessing it.
      
      Note that we also have (rare) paths inside qeth_mpc_initialize() where
      a configuration change can cause us to free the existing queues,
      expecting that subsequent code will allocate them again. If we then
      error out before that re-allocation happens, the same bug occurs.
      
      Fixes: eff73e16 ("s390/qeth: tolerate pre-filled RX buffer")
      Reported-by: NStefan Raspl <raspl@linux.ibm.com>
      Root-caused-by: NHeiko Carstens <hca@linux.ibm.com>
      Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: NAlexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      248f064a
    • M
      spi: Revert modalias changes · 96c8395e
      Mark Brown 提交于
      During the v5.13 cycle we updated the SPI subsystem to generate OF style
      modaliases for SPI devices, replacing the old Linux style modalises we
      used to generate based on spi_device_id which are the DT style name with
      the vendor removed.  Unfortunately this means that we start only
      reporting OF style modalises and not the old ones and there is nothing
      that ensures that drivers list every possible OF compatible string in
      their OF ID table.  The result is that there are systems which have been
      relying on loading modules based on the old style that are now broken,
      as found by Russell King with spi-nor on Macchiatobin.
      
      spi-nor is a particularly problematic case for this, it only lists a
      single generic DT compatible jedec,spi-nor in the driver but supports a
      huge raft of device specific compatibles, with a large set of part
      numbers many of which are offered by multiple vendors.  Russell's
      searches of upstream device trees has turned up examples with vendor
      names written in non-standard ways too.  To make matters worse up until
      8ff16cf7 ("Documentation: devicetree: m25p80: add "nor-jedec"
      binding") the generic compatible was not part of the binding so there
      are device trees out there written to that binding version which don't
      list it all.  The sheer number of parts supported together with our
      previous approach of ignoring the vendor ID makes robustly fixing this
      by adding compatibles to the spi-nor driver seem problematic, the
      current DT binding document does not list all the parts supported by the
      driver at the minute (further patches will fix this).
      
      I've also investigated supporting both formats of modalias
      simultaneously but that doesn't seem possible, especially without
      breaking our userspace ABI which is obviously not viable.
      
      Instead revert the relevant changes for now:
      
      e09f2ab8 ("spi: update modalias_show after of_device_uevent_modalias support")
      3ce6c9e2 ("spi: add of_device_uevent_modalias support")
      
      This will unfortunately mean that any system which had started having
      modules autoload based on the OF compatibles for drivers that list
      things there but not in the spi_device_ids will now not have those
      modules load which is itself a regression.  Since it affects a narrower
      time window and the particularly problematic spi-nor driver may be
      critical to system boot on smaller systems this seems the best of a
      series of bad options.  I will start an audit of SPI drivers to identify
      and fix cases where things won't autoload using spi_device_id, this is
      not great but seems to be the best way forward that anyone has been able
      to identify.
      
      Thanks to Russell for both his report and the additional diagnostic and
      analysis work he has done here, the detailed research above was his
      work.
      
      Fixes: e09f2ab8 ("spi: update modalias_show after of_device_uevent_modalias support")
      Fixes: 3ce6c9e2 ("spi: add of_device_uevent_modalias support")
      Reported-by: NRussell King (Oracle) <linux@armlinux.org.uk>
      Suggested-by: NRussell King (Oracle) <linux@armlinux.org.uk>
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Tested-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Cc: Andreas Schwab <schwab@suse.de>
      Cc: Marco Felsch <m.felsch@pengutronix.de>
      96c8395e
  4. 21 9月, 2021 10 次提交
  5. 20 9月, 2021 1 次提交