1. 27 5月, 2020 5 次提交
  2. 26 5月, 2020 2 次提交
    • Q
      qlcnic: fix missing release in qlcnic_83xx_interrupt_test. · 15c97385
      Qiushi Wu 提交于
      In function qlcnic_83xx_interrupt_test(), function
      qlcnic_83xx_diag_alloc_res() is not handled by function
      qlcnic_83xx_diag_free_res() after a call of the function
      qlcnic_alloc_mbx_args() failed. Fix this issue by adding
      a jump target "fail_mbx_args", and jump to this new target
      when qlcnic_alloc_mbx_args() failed.
      
      Fixes: b6b4316c ("qlcnic: Handle qlcnic_alloc_mbx_args() failure")
      Signed-off-by: NQiushi Wu <wu000273@umn.edu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      15c97385
    • V
      dpaa_eth: fix usage as DSA master, try 3 · 5d14c304
      Vladimir Oltean 提交于
      The dpaa-eth driver probes on compatible string for the MAC node, and
      the fman/mac.c driver allocates a dpaa-ethernet platform device that
      triggers the probing of the dpaa-eth net device driver.
      
      All of this is fine, but the problem is that the struct device of the
      dpaa_eth net_device is 2 parents away from the MAC which can be
      referenced via of_node. So of_find_net_device_by_node can't find it, and
      DSA switches won't be able to probe on top of FMan ports.
      
      It would be a bit silly to modify a core function
      (of_find_net_device_by_node) to look for dev->parent->parent->of_node
      just for one driver. We're just 1 step away from implementing full
      recursion.
      
      Actually there have already been at least 2 previous attempts to make
      this work:
      - Commit a1a50c8e ("fsl/man: Inherit parent device and of_node")
      - One or more of the patches in "[v3,0/6] adapt DPAA drivers for DSA":
        https://patchwork.ozlabs.org/project/netdev/cover/1508178970-28945-1-git-send-email-madalin.bucur@nxp.com/
        (I couldn't really figure out which one was supposed to solve the
        problem and how).
      
      Point being, it looks like this is still pretty much a problem today.
      On T1040, the /sys/class/net/eth0 symlink currently points to
      
      ../../devices/platform/ffe000000.soc/ffe400000.fman/ffe4e6000.ethernet/dpaa-ethernet.0/net/eth0
      
      which pretty much illustrates the problem. The closest of_node we've got
      is the "fsl,fman-memac" at /soc@ffe000000/fman@400000/ethernet@e6000,
      which is what we'd like to be able to reference from DSA as host port.
      
      For of_find_net_device_by_node to find the eth0 port, we would need the
      parent of the eth0 net_device to not be the "dpaa-ethernet" platform
      device, but to point 1 level higher, aka the "fsl,fman-memac" node
      directly. The new sysfs path would look like this:
      
      ../../devices/platform/ffe000000.soc/ffe400000.fman/ffe4e6000.ethernet/net/eth0
      
      And this is exactly what SET_NETDEV_DEV does. It sets the parent of the
      net_device. The new parent has an of_node associated with it, and
      of_dev_node_match already checks for the of_node of the device or of its
      parent.
      
      Fixes: a1a50c8e ("fsl/man: Inherit parent device and of_node")
      Fixes: c6e26ea8 ("dpaa_eth: change device used")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5d14c304
  3. 24 5月, 2020 6 次提交
  4. 23 5月, 2020 22 次提交
    • S
      net/mlx5: Fix error flow in case of function_setup failure · 4f7400d5
      Shay Drory 提交于
      Currently, if an error occurred during mlx5_function_setup(), we
      keep dev->state as DEVICE_STATE_UP.
      Fixing it by adding a goto label.
      
      Fixes: e161105e ("net/mlx5: Function setup/teardown procedures")
      Signed-off-by: NShay Drory <shayd@mellanox.com>
      Reviewed-by: NMoshe Shemesh <moshe@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      4f7400d5
    • R
      net/mlx5e: CT: Correctly get flow rule · d37bd5e8
      Roi Dayan 提交于
      The correct way is to us the flow_cls_offload_flow_rule() wrapper
      instead of f->rule directly.
      
      Fixes: 4c3844d9 ("net/mlx5e: CT: Introduce connection tracking")
      Signed-off-by: NRoi Dayan <roid@mellanox.com>
      Reviewed-by: NOz Shlomo <ozsh@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      d37bd5e8
    • M
      net/mlx5e: Update netdev txq on completions during closure · 5e911e2c
      Moshe Shemesh 提交于
      On sq closure when we free its descriptors, we should also update netdev
      txq on completions which would not arrive. Otherwise if we reopen sqs
      and attach them back, for example on fw fatal recovery flow, we may get
      tx timeout.
      
      Fixes: 29429f33 ("net/mlx5e: Timeout if SQ doesn't flush during close")
      Signed-off-by: NMoshe Shemesh <moshe@mellanox.com>
      Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      5e911e2c
    • R
      net/mlx5: Annotate mutex destroy for root ns · 9ca41539
      Roi Dayan 提交于
      Invoke mutex_destroy() to catch any errors.
      
      Fixes: 2cc43b49 ("net/mlx5_core: Managing root flow table")
      Signed-off-by: NRoi Dayan <roid@mellanox.com>
      Reviewed-by: NMark Bloch <markb@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      9ca41539
    • R
      net/mlx5: Don't maintain a case of del_sw_func being null · 6eb7a268
      Roi Dayan 提交于
      Add del_sw_func cb for root ns. Now there is no need to
      maintain a case of del_sw_func being null when freeing the node.
      
      Fixes: 2cc43b49 ("net/mlx5_core: Managing root flow table")
      Signed-off-by: NRoi Dayan <roid@mellanox.com>
      Reviewed-by: NMark Bloch <markb@mellanox.com>
      Reviewed-by: NPaul Blakey <paulb@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      6eb7a268
    • R
      net/mlx5: Fix cleaning unmanaged flow tables · aee37f3d
      Roi Dayan 提交于
      Unmanaged flow tables doesn't have a parent and tree_put_node()
      assume there is always a parent if cleaning is needed. fix that.
      
      Fixes: 5281a0c9 ("net/mlx5: fs_core: Introduce unmanaged flow tables")
      Signed-off-by: NRoi Dayan <roid@mellanox.com>
      Reviewed-by: NMark Bloch <markb@mellanox.com>
      Reviewed-by: NPaul Blakey <paulb@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      aee37f3d
    • M
      net/mlx5: Fix memory leak in mlx5_events_init · df14ad1e
      Moshe Shemesh 提交于
      Fix memory leak in mlx5_events_init(), in case
      create_single_thread_workqueue() fails, events
      struct should be freed.
      
      Fixes: 5d3c537f ("net/mlx5: Handle event of power detection in the PCIE slot")
      Signed-off-by: NMoshe Shemesh <moshe@mellanox.com>
      Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      df14ad1e
    • R
      net/mlx5e: Fix inner tirs handling · a16b8e0d
      Roi Dayan 提交于
      In the cited commit inner_tirs argument was added to create and destroy
      inner tirs, and no indication was added to mlx5e_modify_tirs_hash()
      function. In order to have a consistent handling, use
      inner_indir_tir[0].tirn in tirs destroy/modify function as an indication
      to whether inner tirs are created.
      Inner tirs are not created for representors and before this commit,
      a call to mlx5e_modify_tirs_hash() was sending HW commands to
      modify non-existent inner tirs.
      
      Fixes: 46dc933c ("net/mlx5e: Provide explicit directive if to create inner indirect tirs")
      Signed-off-by: NRoi Dayan <roid@mellanox.com>
      Reviewed-by: NVlad Buslov <vladbu@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      a16b8e0d
    • T
      net/mlx5e: kTLS, Destroy key object after destroying the TIS · 16736e11
      Tariq Toukan 提交于
      The TLS TIS object contains the dek/key ID.
      By destroying the key first, the TIS would contain an invalid
      non-existing key ID.
      Reverse the destroy order, this also acheives the desired assymetry
      between the destroy and the create flows.
      
      Fixes: d2ead1f3 ("net/mlx5e: Add kTLS TX HW offload support")
      Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
      Reviewed-by: NBoris Pismenny <borisp@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      16736e11
    • M
      net/mlx5e: Fix allowed tc redirect merged eswitch offload cases · 32134847
      Maor Dickman 提交于
      After changing the parent_id to be the same for both NICs of same
      The cited commit wrongly allow offload of tc redirect flows from
      VF to uplink and vice versa when devcies are on different eswitch,
      these cases aren't supported by HW.
      
      Disallow the above offloads when devcies are on different eswitch
      and VF LAG is not configured.
      
      Fixes: f6dc1264 ("net/mlx5e: Disallow tc redirect offload cases we don't support")
      Signed-off-by: NMaor Dickman <maord@mellanox.com>
      Reviewed-by: NRoi Dayan <roid@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      32134847
    • E
      net/mlx5: Avoid processing commands before cmdif is ready · f7936ddd
      Eran Ben Elisha 提交于
      When driver is reloading during recovery flow, it can't get new commands
      till command interface is up again. Otherwise we may get to null pointer
      trying to access non initialized command structures.
      
      Add cmdif state to avoid processing commands while cmdif is not ready.
      
      Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
      Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: NMoshe Shemesh <moshe@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      f7936ddd
    • E
      net/mlx5: Fix a race when moving command interface to events mode · d43b7007
      Eran Ben Elisha 提交于
      After driver creates (via FW command) an EQ for commands, the driver will
      be informed on new commands completion by EQE. However, due to a race in
      driver's internal command mode metadata update, some new commands will
      still be miss-handled by driver as if we are in polling mode. Such commands
      can get two non forced completion, leading to already freed command entry
      access.
      
      CREATE_EQ command, that maps EQ to the command queue must be posted to the
      command queue while it is empty and no other command should be posted.
      
      Add SW mechanism that once the CREATE_EQ command is about to be executed,
      all other commands will return error without being sent to the FW. Allow
      sending other commands only after successfully changing the driver's
      internal command mode metadata.
      We can safely return error to all other commands while creating the command
      EQ, as all other commands might be sent from the user/application during
      driver load. Application can rerun them later after driver's load was
      finished.
      
      Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
      Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: NMoshe Shemesh <moshe@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      d43b7007
    • M
      net/mlx5: Add command entry handling completion · 17d00e83
      Moshe Shemesh 提交于
      When FW response to commands is very slow and all command entries in
      use are waiting for completion we can have a race where commands can get
      timeout before they get out of the queue and handled. Timeout
      completion on uninitialized command will cause releasing command's
      buffers before accessing it for initialization and then we will get NULL
      pointer exception while trying access it. It may also cause releasing
      buffers of another command since we may have timeout completion before
      even allocating entry index for this command.
      Add entry handling completion to avoid this race.
      
      Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
      Signed-off-by: NMoshe Shemesh <moshe@mellanox.com>
      Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      17d00e83
    • Q
      net: sun: fix missing release regions in cas_init_one(). · 5a730153
      Qiushi Wu 提交于
      In cas_init_one(), "pdev" is requested by "pci_request_regions", but it
      was not released after a call of the function “pci_write_config_byte”
      failed. Thus replace the jump target “err_write_cacheline” by
      "err_out_free_res".
      
      Fixes: 1f26dac3 ("[NET]: Add Sun Cassini driver.")
      Signed-off-by: NQiushi Wu <wu000273@umn.edu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a730153
    • V
      net: mscc: ocelot: fix address ageing time (again) · bf655ba2
      Vladimir Oltean 提交于
      ocelot_set_ageing_time has 2 callers:
       - felix_set_ageing_time: from drivers/net/dsa/ocelot/felix.c
       - ocelot_port_attr_ageing_set: from drivers/net/ethernet/mscc/ocelot.c
      
      The issue described in the fixed commit below actually happened for the
      felix_set_ageing_time code path only, since ocelot_port_attr_ageing_set
      was already dividing by 1000. So to make both paths symmetrical (and to
      fix addresses getting aged way too fast on Ocelot), stop dividing by
      1000 at caller side altogether.
      
      Fixes: c0d7eccb ("net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bf655ba2
    • H
      r8169: fix OCP access on RTL8117 · 561535b0
      Heiner Kallweit 提交于
      According to r8168 vendor driver DASHv3 chips like RTL8168fp/RTL8117
      need a special addressing for OCP access.
      Fix is compile-tested only due to missing test hardware.
      
      Fixes: 1287723a ("r8169: add support for RTL8117")
      Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      561535b0
    • J
      mlxsw: spectrum: Fix use-after-free of split/unsplit/type_set in case reload fails · 4340f42f
      Jiri Pirko 提交于
      In case of reload fail, the mlxsw_sp->ports contains a pointer to a
      freed memory (either by reload_down() or reload_up() error path).
      Fix this by initializing the pointer to NULL and checking it before
      dereferencing in split/unsplit/type_set callpaths.
      
      Fixes: 24cc68ad ("mlxsw: core: Add support for reload")
      Reported-by: NDanielle Ratson <danieller@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4340f42f
    • J
      net: ethernet: stmmac: Enable interface clocks on probe for IPQ806x · a96ac8a0
      Jonathan McDowell 提交于
      The ipq806x_gmac_probe() function enables the PTP clock but not the
      appropriate interface clocks. This means that if the bootloader hasn't
      done so attempting to bring up the interface will fail with an error
      like:
      
      [   59.028131] ipq806x-gmac-dwmac 37600000.ethernet: Failed to reset the dma
      [   59.028196] ipq806x-gmac-dwmac 37600000.ethernet eth1: stmmac_hw_setup: DMA engine initialization failed
      [   59.034056] ipq806x-gmac-dwmac 37600000.ethernet eth1: stmmac_open: Hw setup failed
      
      This patch, a slightly cleaned up version of one posted by Sergey
      Sergeev in:
      
      https://forum.openwrt.org/t/support-for-mikrotik-rb3011uias-rm/4064/257
      
      correctly enables the clock; we have already configured the source just
      before this.
      
      Tested on a MikroTik RB3011.
      Signed-off-by: NJonathan McDowell <noodles@earth.li>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a96ac8a0
    • I
      netdevsim: Ensure policer drop counter always increases · be43224f
      Ido Schimmel 提交于
      In case the policer drop counter is retrieved when the jiffies value is
      a multiple of 64, the counter will not be incremented.
      
      This randomly breaks a selftest [1] the reads the counter twice and
      checks that it was incremented:
      
      ```
      TEST: Trap policer                                                  [FAIL]
      	Policer drop counter was not incremented
      ```
      
      Fix by always incrementing the counter by 1.
      
      [1] tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh
      
      Fixes: ad188458 ("netdevsim: Add devlink-trap policer support")
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be43224f
    • V
      net/ethernet/freescale: rework quiesce/activate for ucc_geth · 79dde73c
      Valentin Longchamp 提交于
      ugeth_quiesce/activate are used to halt the controller when there is a
      link change that requires to reconfigure the mac.
      
      The previous implementation called netif_device_detach(). This however
      causes the initial activation of the netdevice to fail precisely because
      it's detached. For details, see [1].
      
      A possible workaround was the revert of commit
      net: linkwatch: add check for netdevice being present to linkwatch_do_dev
      However, the check introduced in the above commit is correct and shall be
      kept.
      
      The netif_device_detach() is thus replaced with
      netif_tx_stop_all_queues() that prevents any tranmission. This allows to
      perform mac config change required by the link change, without detaching
      the corresponding netdevice and thus not preventing its initial
      activation.
      
      [1] https://lists.openwall.net/netdev/2020/01/08/201Signed-off-by: NValentin Longchamp <valentin@longchamp.me>
      Acked-by: NMatteo Ghidoni <matteo.ghidoni@ch.abb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79dde73c
    • R
      net: mvpp2: fix RX hashing for non-10G ports · 3138a07c
      Russell King 提交于
      When rxhash is enabled on any ethernet port except the first in each CP
      block, traffic flow is prevented.  The analysis is below:
      
      I've been investigating this afternoon, and what I've found, comparing
      a kernel without 895586d5 and with 895586d5 applied is:
      
      - The table programmed into the hardware via mvpp22_rss_fill_table()
        appears to be identical with or without the commit.
      
      - When rxhash is enabled on eth2, mvpp2_rss_port_c2_enable() reports
        that c2.attr[0] and c2.attr[2] are written back containing:
      
         - with 895586d5, failing:    00200000 40000000
         - without 895586d5, working: 04000000 40000000
      
      - When disabling rxhash, c2.attr[0] and c2.attr[2] are written back as:
      
         04000000 00000000
      
      The second value represents the MVPP22_CLS_C2_ATTR2_RSS_EN bit, the
      first value is the queue number, which comprises two fields. The high
      5 bits are 24:29 and the low three are 21:23 inclusive. This comes
      from:
      
             c2.attr[0] = MVPP22_CLS_C2_ATTR0_QHIGH(qh) |
                           MVPP22_CLS_C2_ATTR0_QLOW(ql);
      
      So, the working case gives eth2 a queue id of 4.0, or 32 as per
      port->first_rxq, and the non-working case a queue id of 0.1, or 1.
      The allocation of queue IDs seems to be in mvpp2_port_probe():
      
              if (priv->hw_version == MVPP21)
                      port->first_rxq = port->id * port->nrxqs;
              else
                      port->first_rxq = port->id * priv->max_port_rxqs;
      
      Where:
      
              if (priv->hw_version == MVPP21)
                      priv->max_port_rxqs = 8;
              else
                      priv->max_port_rxqs = 32;
      
      Making the port 0 (eth0 / eth1) have port->first_rxq = 0, and port 1
      (eth2) be 32. It seems the idea is that the first 32 queues belong to
      port 0, the second 32 queues belong to port 1, etc.
      
      mvpp2_rss_port_c2_enable() gets the queue number from it's parameter,
      'ctx', which comes from mvpp22_rss_ctx(port, 0). This returns
      port->rss_ctx[0].
      
      mvpp22_rss_context_create() is responsible for allocating that, which
      it does by looking for an unallocated priv->rss_tables[] pointer. This
      table is shared amongst all ports on the CP silicon.
      
      When we write the tables in mvpp22_rss_fill_table(), the RSS table
      entry is defined by:
      
                      u32 sel = MVPP22_RSS_INDEX_TABLE(rss_ctx) |
                                MVPP22_RSS_INDEX_TABLE_ENTRY(i);
      
      where rss_ctx is the context ID (queue number) and i is the index in
      the table.
      
      If we look at what is written:
      
      - The first table to be written has "sel" values of 00000000..0000001f,
        containing values 0..3. This appears to be for eth1. This is table 0,
        RX queue number 0.
      - The second table has "sel" values of 00000100..0000011f, and appears
        to be for eth2.  These contain values 0x20..0x23. This is table 1,
        RX queue number 0.
      - The third table has "sel" values of 00000200..0000021f, and appears
        to be for eth3.  These contain values 0x40..0x43. This is table 2,
        RX queue number 0.
      
      How do queue numbers translate to the RSS table?  There is another
      table - the RXQ2RSS table, indexed by the MVPP22_RSS_INDEX_QUEUE field
      of MVPP22_RSS_INDEX and accessed through the MVPP22_RXQ2RSS_TABLE
      register. Before 895586d5, it was:
      
             mvpp2_write(priv, MVPP22_RSS_INDEX,
                         MVPP22_RSS_INDEX_QUEUE(port->first_rxq));
             mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE,
                         MVPP22_RSS_TABLE_POINTER(port->id));
      
      and after:
      
             mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(ctx));
             mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE, MVPP22_RSS_TABLE_POINTER(ctx));
      
      Before the commit, for eth2, that would've contained '32' for the
      index and '1' for the table pointer - mapping queue 32 to table 1.
      Remember that this is queue-high.queue-low of 4.0.
      
      After the commit, we appear to map queue 1 to table 1. That again
      looks fine on the face of it.
      
      Section 9.3.1 of the A8040 manual seems indicate the reason that the
      queue number is separated. queue-low seems to always come from the
      classifier, whereas queue-high can be from the ingress physical port
      number or the classifier depending on the MVPP2_CLS_SWFWD_PCTRL_REG.
      
      We set the port bit in MVPP2_CLS_SWFWD_PCTRL_REG, meaning that queue-high
      comes from the MVPP2_CLS_SWFWD_P2HQ_REG() register... and this seems to
      be where our bug comes from.
      
      mvpp2_cls_oversize_rxq_set() sets this up as:
      
              mvpp2_write(port->priv, MVPP2_CLS_SWFWD_P2HQ_REG(port->id),
                          (port->first_rxq >> MVPP2_CLS_OVERSIZE_RXQ_LOW_BITS));
      
              val = mvpp2_read(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG);
              val |= MVPP2_CLS_SWFWD_PCTRL_MASK(port->id);
              mvpp2_write(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG, val);
      
      Setting the MVPP2_CLS_SWFWD_PCTRL_MASK bit means that the queue-high
      for eth2 is _always_ 4, so only queues 32 through 39 inclusive are
      available to eth2. Yet, we're trying to tell the classifier to set
      queue-high, which will be ignored, to zero. Hence, the queue-high
      field (MVPP22_CLS_C2_ATTR0_QHIGH()) from the classifier will be
      ignored.
      
      This means we end up directing traffic from eth2 not to queue 1, but
      to queue 33, and then we tell it to look up queue 33 in the RSS table.
      However, RSS table has not been programmed for queue 33, and so it ends
      up (presumably) dropping the packets.
      
      It seems that mvpp22_rss_context_create() doesn't take account of the
      fact that the upper 5 bits of the queue ID can't actually be changed
      due to the settings in mvpp2_cls_oversize_rxq_set(), _or_ it seems that
      mvpp2_cls_oversize_rxq_set() has been missed in this commit. Either
      way, these two functions mutually disagree with what queue number
      should be used.
      
      Looking deeper into what mvpp2_cls_oversize_rxq_set() and the MTU
      validation is doing, it seems that MVPP2_CLS_SWFWD_P2HQ_REG() is used
      for over-sized packets attempting to egress through this port. With
      the classifier having had RSS enabled and directing eth2 traffic to
      queue 1, we may still have packets appearing on queue 32 for this port.
      
      However, the only way we may end up with over-sized packets attempting
      to egress through eth2 - is if the A8040 forwards frames between its
      ports. From what I can see, we don't support that feature, and the
      kernel restricts the egress packet size to the MTU. In any case, if we
      were to attempt to transmit an oversized packet, we have no support in
      the kernel to deal with that appearing in the port's receive queue.
      
      So, this patch attempts to solve the issue by clearing the
      MVPP2_CLS_SWFWD_PCTRL_MASK() bit, allowing MVPP22_CLS_C2_ATTR0_QHIGH()
      from the classifier to define the queue-high field of the queue number.
      
      My testing seems to confirm my findings above - clearing this bit
      means that if I enable rxhash on eth2, the interface can then pass
      traffic, as we are now directing traffic to RX queue 1 rather than
      queue 33. Traffic still seems to work with rxhash off as well.
      Reported-by: NMatteo Croce <mcroce@redhat.com>
      Tested-by: NMatteo Croce <mcroce@redhat.com>
      Fixes: 895586d5 ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3138a07c
    • C
      felix: Fix initialization of ioremap resources · b4024c9e
      Claudiu Manoil 提交于
      The caller of devm_ioremap_resource(), either accidentally
      or by wrong assumption, is writing back derived resource data
      to global static resource initialization tables that should
      have been constant.  Meaning that after it computes the final
      physical start address it saves the address for no reason
      in the static tables.  This doesn't affect the first driver
      probing after reboot, but it breaks consecutive driver reloads
      (i.e. driver unbind & bind) because the initialization tables
      no longer have the correct initial values.  So the next probe()
      will map the device registers to wrong physical addresses,
      causing ARM SError async exceptions.
      This patch fixes all of the above.
      
      Fixes: 56051948 ("net: dsa: ocelot: add driver for Felix switch family")
      Signed-off-by: NClaudiu Manoil <claudiu.manoil@nxp.com>
      Reviewed-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b4024c9e
  5. 22 5月, 2020 3 次提交
  6. 21 5月, 2020 2 次提交
    • J
      wireguard: noise: separate receive counter from send counter · a9e90d99
      Jason A. Donenfeld 提交于
      In "wireguard: queueing: preserve flow hash across packet scrubbing", we
      were required to slightly increase the size of the receive replay
      counter to something still fairly small, but an increase nonetheless.
      It turns out that we can recoup some of the additional memory overhead
      by splitting up the prior union type into two distinct types. Before, we
      used the same "noise_counter" union for both sending and receiving, with
      sending just using a simple atomic64_t, while receiving used the full
      replay counter checker. This meant that most of the memory being
      allocated for the sending counter was being wasted. Since the old
      "noise_counter" type increased in size in the prior commit, now is a
      good time to split up that union type into a distinct "noise_replay_
      counter" for receiving and a boring atomic64_t for sending, each using
      neither more nor less memory than required.
      
      Also, since sometimes the replay counter is accessed without
      necessitating additional accesses to the bitmap, we can reduce cache
      misses by hoisting the always-necessary lock above the bitmap in the
      struct layout. We also change a "noise_replay_counter" stack allocation
      to kmalloc in a -DDEBUG selftest so that KASAN doesn't trigger a stack
      frame warning.
      
      All and all, removing a bit of abstraction in this commit makes the code
      simpler and smaller, in addition to the motivating memory usage
      recuperation. For example, passing around raw "noise_symmetric_key"
      structs is something that really only makes sense within noise.c, in the
      one place where the sending and receiving keys can safely be thought of
      as the same type of object; subsequent to that, it's important that we
      uniformly access these through keypair->{sending,receiving}, where their
      distinct roles are always made explicit. So this patch allows us to draw
      that distinction clearly as well.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a9e90d99
    • J
      wireguard: queueing: preserve flow hash across packet scrubbing · c78a0b4a
      Jason A. Donenfeld 提交于
      It's important that we clear most header fields during encapsulation and
      decapsulation, because the packet is substantially changed, and we don't
      want any info leak or logic bug due to an accidental correlation. But,
      for encapsulation, it's wrong to clear skb->hash, since it's used by
      fq_codel and flow dissection in general. Without it, classification does
      not proceed as usual. This change might make it easier to estimate the
      number of innerflows by examining clustering of out of order packets,
      but this shouldn't open up anything that can't already be inferred
      otherwise (e.g. syn packet size inference), and fq_codel can be disabled
      anyway.
      
      Furthermore, it might be the case that the hash isn't used or queried at
      all until after wireguard transmits the encrypted UDP packet, which
      means skb->hash might still be zero at this point, and thus no hash
      taken over the inner packet data. In order to address this situation, we
      force a calculation of skb->hash before encrypting packet data.
      
      Of course this means that fq_codel might transmit packets slightly more
      out of order than usual. Toke did some testing on beefy machines with
      high quantities of parallel flows and found that increasing the
      reply-attack counter to 8192 takes care of the most pathological cases
      pretty well.
      Reported-by: NDave Taht <dave.taht@gmail.com>
      Reviewed-and-tested-by: NToke Høiland-Jørgensen <toke@toke.dk>
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c78a0b4a