1. 19 Jun, 2020 1 commit
  2. 17 Jun, 2020 1 commit
  3. 16 Jun, 2020 1 commit
  4. 23 May, 2020 1 commit
    • R
      net: mvpp2: fix RX hashing for non-10G ports · 3138a07c
      Committed by Russell King
      When rxhash is enabled on any ethernet port except the first in each CP
      block, traffic flow is prevented.  The analysis is below:
      
      I've been investigating this afternoon, and what I've found, comparing
      a kernel without 895586d5 and with 895586d5 applied is:
      
      - The table programmed into the hardware via mvpp22_rss_fill_table()
        appears to be identical with or without the commit.
      
      - When rxhash is enabled on eth2, mvpp2_rss_port_c2_enable() reports
        that c2.attr[0] and c2.attr[2] are written back containing:
      
         - with 895586d5, failing:    00200000 40000000
         - without 895586d5, working: 04000000 40000000
      
      - When disabling rxhash, c2.attr[0] and c2.attr[2] are written back as:
      
         04000000 00000000
      
      The second value represents the MVPP22_CLS_C2_ATTR2_RSS_EN bit, the
      first value is the queue number, which comprises two fields. The high
      5 bits are 24:28 and the low three are 21:23 inclusive. This comes
      from:
      
             c2.attr[0] = MVPP22_CLS_C2_ATTR0_QHIGH(qh) |
                           MVPP22_CLS_C2_ATTR0_QLOW(ql);
      
      So, the working case gives eth2 a queue id of 4.0, or 32 as per
      port->first_rxq, and the non-working case a queue id of 0.1, or 1.
      The allocation of queue IDs seems to be in mvpp2_port_probe():
      
              if (priv->hw_version == MVPP21)
                      port->first_rxq = port->id * port->nrxqs;
              else
                      port->first_rxq = port->id * priv->max_port_rxqs;
      
      Where:
      
              if (priv->hw_version == MVPP21)
                      priv->max_port_rxqs = 8;
              else
                      priv->max_port_rxqs = 32;
      
      Making the port 0 (eth0 / eth1) have port->first_rxq = 0, and port 1
      (eth2) be 32. It seems the idea is that the first 32 queues belong to
      port 0, the second 32 queues belong to port 1, etc.
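
      As an illustration of the decoding used above, the following is a
      minimal sketch, not driver code. The shift and mask names are
      hypothetical; the field positions are taken from the description
      (queue-low in bits 21..23, a five-bit queue-high starting at bit 24).

```c
#include <stdint.h>

/* Assumed field layout of c2.attr[0], per the analysis above. */
#define QLOW_SHIFT   21
#define QLOW_MASK    0x7u
#define QHIGH_SHIFT  24
#define QHIGH_MASK   0x1fu
#define QLOW_BITS    3

/* Rebuild the RX queue number (queue-high.queue-low) from attr[0]. */
static unsigned int attr0_to_rxq(uint32_t attr0)
{
	unsigned int qh = (attr0 >> QHIGH_SHIFT) & QHIGH_MASK;
	unsigned int ql = (attr0 >> QLOW_SHIFT) & QLOW_MASK;

	return (qh << QLOW_BITS) | ql;
}
```

      With these assumptions, attr0_to_rxq(0x04000000) gives 32 (the
      working case) and attr0_to_rxq(0x00200000) gives 1 (the failing case).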
      
      mvpp2_rss_port_c2_enable() gets the queue number from its parameter,
      'ctx', which comes from mvpp22_rss_ctx(port, 0). This returns
      port->rss_ctx[0].
      
      mvpp22_rss_context_create() is responsible for allocating that, which
      it does by looking for an unallocated priv->rss_tables[] pointer. This
      table is shared amongst all ports on the CP silicon.
      
      When we write the tables in mvpp22_rss_fill_table(), the RSS table
      entry is defined by:
      
                      u32 sel = MVPP22_RSS_INDEX_TABLE(rss_ctx) |
                                MVPP22_RSS_INDEX_TABLE_ENTRY(i);
      
      where rss_ctx is the context ID (queue number) and i is the index in
      the table.
      
      If we look at what is written:
      
      - The first table to be written has "sel" values of 00000000..0000001f,
        containing values 0..3. This appears to be for eth1. This is table 0,
        RX queue number 0.
      - The second table has "sel" values of 00000100..0000011f, and appears
        to be for eth2.  These contain values 0x20..0x23. This is table 1,
        RX queue number 0.
      - The third table has "sel" values of 00000200..0000021f, and appears
        to be for eth3.  These contain values 0x40..0x43. This is table 2,
        RX queue number 0.
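
      The "sel" values listed above can be reproduced with a small sketch.
      The macro definitions here are assumptions inferred from the dumped
      values (table number at bit 8, entry index in the low bits), and the
      "entry modulo 4" spread is only a guess at the default indirection
      that produced 0x20..0x23; neither is copied from the driver.

```c
#include <stdint.h>

/* Assumed layout: table number at bit 8, entry index in the low bits. */
#define RSS_INDEX_TABLE(ctx)      ((uint32_t)(ctx) << 8)
#define RSS_INDEX_TABLE_ENTRY(i)  ((uint32_t)(i))

/* Selector written before programming one RSS table entry. */
static uint32_t rss_sel(unsigned int rss_ctx, unsigned int entry)
{
	return RSS_INDEX_TABLE(rss_ctx) | RSS_INDEX_TABLE_ENTRY(entry);
}

/* Queue stored in an entry, assuming traffic is spread over the first
 * four queues of the port (hypothetical default indirection). */
static unsigned int rss_entry_value(unsigned int first_rxq, unsigned int entry)
{
	return first_rxq + (entry % 4);
}
```

      Under these assumptions, table 1 spans rss_sel(1, 0) = 0x100 to
      rss_sel(1, 31) = 0x11f, and a port with first_rxq = 32 stores the
      values 0x20..0x23.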
      
      How do queue numbers translate to the RSS table?  There is another
      table - the RXQ2RSS table, indexed by the MVPP22_RSS_INDEX_QUEUE field
      of MVPP22_RSS_INDEX and accessed through the MVPP22_RXQ2RSS_TABLE
      register. Before 895586d5, it was:
      
             mvpp2_write(priv, MVPP22_RSS_INDEX,
                         MVPP22_RSS_INDEX_QUEUE(port->first_rxq));
             mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE,
                         MVPP22_RSS_TABLE_POINTER(port->id));
      
      and after:
      
             mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(ctx));
             mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE, MVPP22_RSS_TABLE_POINTER(ctx));
      
      Before the commit, for eth2, that would've contained '32' for the
      index and '1' for the table pointer - mapping queue 32 to table 1.
      Remember that this is queue-high.queue-low of 4.0.
      
      After the commit, we appear to map queue 1 to table 1. That again
      looks fine on the face of it.
      
      Section 9.3.1 of the A8040 manual seems to indicate the reason that the
      queue number is separated. queue-low seems to always come from the
      classifier, whereas queue-high can be from the ingress physical port
      number or the classifier depending on the MVPP2_CLS_SWFWD_PCTRL_REG.
      
      We set the port bit in MVPP2_CLS_SWFWD_PCTRL_REG, meaning that queue-high
      comes from the MVPP2_CLS_SWFWD_P2HQ_REG() register... and this seems to
      be where our bug comes from.
      
      mvpp2_cls_oversize_rxq_set() sets this up as:
      
              mvpp2_write(port->priv, MVPP2_CLS_SWFWD_P2HQ_REG(port->id),
                          (port->first_rxq >> MVPP2_CLS_OVERSIZE_RXQ_LOW_BITS));
      
              val = mvpp2_read(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG);
              val |= MVPP2_CLS_SWFWD_PCTRL_MASK(port->id);
              mvpp2_write(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG, val);
      
      Setting the MVPP2_CLS_SWFWD_PCTRL_MASK bit means that the queue-high
      for eth2 is _always_ 4, so only queues 32 through 39 inclusive are
      available to eth2. Yet, we're trying to tell the classifier to set
      queue-high, which will be ignored, to zero. Hence, the queue-high
      field (MVPP22_CLS_C2_ATTR0_QHIGH()) from the classifier will be
      ignored.
      
      This means we end up directing traffic from eth2 not to queue 1, but
      to queue 33, and then we tell it to look up queue 33 in the RSS table.
      However, the RSS table has not been programmed for queue 33, and so it ends
      up (presumably) dropping the packets.
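
      The queue selection described above can be modelled in a few lines.
      This is a hypothetical model of the hardware behaviour as inferred in
      this analysis, not code from the driver or the manual; the function
      and parameter names are invented for illustration.

```c
#include <stdbool.h>

#define RXQ_LOW_BITS 3	/* queue-low is three bits wide */

/*
 * Model of the final RX queue choice: when the per-port bit in
 * MVPP2_CLS_SWFWD_PCTRL_REG is set, queue-high comes from the P2HQ
 * register and the classifier's queue-high is ignored.
 */
static unsigned int effective_rxq(bool port_bit_set, unsigned int p2hq,
				  unsigned int cls_qh, unsigned int cls_ql)
{
	unsigned int qh = port_bit_set ? p2hq : cls_qh;

	return (qh << RXQ_LOW_BITS) | (cls_ql & 0x7);
}
```

      For eth2 (first_rxq = 32, so P2HQ holds 4) with the classifier asking
      for queue 0.1, the model yields 33 while the port bit is set and 1
      once it is cleared.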
      
      It seems that mvpp22_rss_context_create() doesn't take account of the
      fact that the upper 5 bits of the queue ID can't actually be changed
      due to the settings in mvpp2_cls_oversize_rxq_set(), _or_ it seems that
      mvpp2_cls_oversize_rxq_set() has been missed in this commit. Either
      way, these two functions mutually disagree with what queue number
      should be used.
      
      Looking deeper into what mvpp2_cls_oversize_rxq_set() and the MTU
      validation is doing, it seems that MVPP2_CLS_SWFWD_P2HQ_REG() is used
      for over-sized packets attempting to egress through this port. With
      the classifier having had RSS enabled and directing eth2 traffic to
      queue 1, we may still have packets appearing on queue 32 for this port.
      
      However, the only way we may end up with over-sized packets attempting
      to egress through eth2 - is if the A8040 forwards frames between its
      ports. From what I can see, we don't support that feature, and the
      kernel restricts the egress packet size to the MTU. In any case, if we
      were to attempt to transmit an oversized packet, we have no support in
      the kernel to deal with that appearing in the port's receive queue.
      
      So, this patch attempts to solve the issue by clearing the
      MVPP2_CLS_SWFWD_PCTRL_MASK() bit, allowing MVPP22_CLS_C2_ATTR0_QHIGH()
      from the classifier to define the queue-high field of the queue number.
      
      My testing seems to confirm my findings above - clearing this bit
      means that if I enable rxhash on eth2, the interface can then pass
      traffic, as we are now directing traffic to RX queue 1 rather than
      queue 33. Traffic still seems to work with rxhash off as well.
      Reported-by: Matteo Croce <mcroce@redhat.com>
      Tested-by: Matteo Croce <mcroce@redhat.com>
      Fixes: 895586d5 ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3138a07c
  5. 07 May, 2020 2 commits
  6. 18 Mar, 2020 1 commit
  7. 15 Mar, 2020 1 commit
  8. 09 Mar, 2020 1 commit
  9. 28 Feb, 2020 2 commits
  10. 06 Jan, 2020 1 commit
  11. 20 Dec, 2019 1 commit
    • R
      net: mvpp2: cycle comphy to power it down · 6791c102
      Committed by Russell King
      Presently, at boot time, the comphys are enabled. For firmware
      compatibility reasons, the comphy driver does not power down the
      comphys at boot. Consequently, the ethernet comphys are left active
      until the network interfaces are brought through an up/down cycle.
      
      If the port is never used, the port wastes power needlessly. Arrange
      for the ethernet comphys to be cycled by the mvpp2 driver as if the
      interface went through an up/down cycle during driver probe, thereby
      powering them down.
      
      This saves:
        270mW per 10G SFP+ port on the Macchiatobin Single Shot (eth0/eth1)
        370mW per 10G PHY port on the Macchiatobin Double Shot (eth0/eth1)
        160mW on the SFP port on either Macchiatobin flavour (eth3)
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6791c102
  12. 18 Dec, 2019 1 commit
  13. 15 Dec, 2019 1 commit
    • R
      net: marvell: mvpp2: phylink requires the link interrupt · f3f2364e
      Committed by Russell King
      phylink requires the MAC to report when its link status changes when
      operating in inband modes.  Failure to report link status changes
      means that phylink has no idea when the link events happen, which
      results in either the network interface's carrier remaining up or
      remaining permanently down.
      
      For example, with a fiber module, if the interface is brought up and
      link is initially established, taking the link down at the far end
      will cut the optical power.  The SFP module's LOS asserts, we
      deactivate the link, and the network interface reports no carrier.
      
      When the far end is brought back up, the SFP module's LOS deasserts,
      but the MAC may be slower to establish link.  If this happens (which
      in my tests is a certainty) then phylink never hears that the MAC
      has established link with the far end, and the network interface is
      stuck reporting no carrier.  This means the interface is
      non-functional.
      
      Avoiding the link interrupt when we have phylink is basically not
      an option, so remove the !port->phylink from the test.
      
      Fixes: 4bb04326 ("net: mvpp2: phylink support")
      Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
      Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      f3f2364e
  14. 24 Nov, 2019 1 commit
  15. 29 Oct, 2019 3 commits
  16. 03 Oct, 2019 1 commit
  17. 03 Sep, 2019 2 commits
    • M
      mvpp2: percpu buffers · 7d04b0b1
      Committed by Matteo Croce
      Every mvpp2 unit can use up to 8 buffers mapped by the BM (the HW buffer
      manager). The HW will place the frames in the buffer pool depending on the
      frame size: short (< 128 bytes), long (< 1664) or jumbo (up to 9856).
      
      As any unit can have up to 4 ports, the driver allocates only 2 pools,
      one for short and one for long frames, and shares them between ports.
      When the first port MTU is set higher than 1664 bytes, a third pool is
      allocated for jumbo frames.
      
      This shared allocation makes it impossible to use percpu allocators,
      and creates contention between HW queues.
      
      If possible, i.e. if the number of possible CPUs is less than 8 and jumbo
      frames are not used, switch to a new scheme: allocate 8 per-cpu pools for
      short and long frames and bind every pool to an RXQ.
      
      When the first port MTU is set higher than 1664 bytes, the allocation
      scheme is reverted to the old behaviour (3 shared pools), and when all
      ports MTU are lowered, the per-cpu buffers are allocated again.
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d04b0b1
    • M
      mvpp2: refactor BM pool functions · 13616361
      Committed by Matteo Croce
      Refactor mvpp2_bm_pool_create(), mvpp2_bm_pool_destroy() and
      mvpp2_bm_pools_init() so that they accept a struct device instead
      of a struct platform_device, as they just need platform_device->dev.
      
      Removing this dependency makes the BM code more reusable in contexts
      where we don't have a pointer to the platform_device.
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      13616361
  18. 15 Aug, 2019 1 commit
  19. 11 Aug, 2019 1 commit
  20. 02 Aug, 2019 2 commits
    • Y
      mvpp2: use devm_platform_ioremap_resource() to simplify code · 3230a55b
      Committed by YueHaibing
      Use devm_platform_ioremap_resource() to simplify the code a bit.
      This is detected by coccinelle.
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3230a55b
    • M
      mvpp2: fix panic on module removal · 944a83a2
      Committed by Matteo Croce
      mvpp2 uses a delayed workqueue to gather traffic statistics.
      On module removal the workqueue can be destroyed before calling
      cancel_delayed_work_sync() on its works.
      Fix it by moving the destroy_workqueue() call after mvpp2_port_remove().
      Also remove an unneeded call to flush_workqueue().
      
          # rmmod mvpp2
          [ 2743.311722] mvpp2 f4000000.ethernet eth1: phy link down 10gbase-kr/10Gbps/Full
          [ 2743.320063] mvpp2 f4000000.ethernet eth1: Link is Down
          [ 2743.572263] mvpp2 f4000000.ethernet eth2: phy link down sgmii/1Gbps/Full
          [ 2743.580076] mvpp2 f4000000.ethernet eth2: Link is Down
          [ 2744.102169] mvpp2 f2000000.ethernet eth0: phy link down 10gbase-kr/10Gbps/Full
          [ 2744.110441] mvpp2 f2000000.ethernet eth0: Link is Down
          [ 2744.115614] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
          [ 2744.115615] Mem abort info:
          [ 2744.115616]   ESR = 0x96000005
          [ 2744.115617]   Exception class = DABT (current EL), IL = 32 bits
          [ 2744.115618]   SET = 0, FnV = 0
          [ 2744.115619]   EA = 0, S1PTW = 0
          [ 2744.115620] Data abort info:
          [ 2744.115621]   ISV = 0, ISS = 0x00000005
          [ 2744.115622]   CM = 0, WnR = 0
          [ 2744.115624] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000422681000
          [ 2744.115626] [0000000000000000] pgd=0000000000000000, pud=0000000000000000
          [ 2744.115630] Internal error: Oops: 96000005 [#1] SMP
          [ 2744.115632] Modules linked in: mvpp2(-) algif_hash af_alg nls_iso8859_1 nls_cp437 vfat fat xhci_plat_hcd m25p80 spi_nor xhci_hcd mtd usbcore i2c_mv64xxx sfp usb_common marvell10g phy_generic spi_orion mdio_i2c i2c_core mvmdio phylink sbsa_gwdt ip_tables x_tables autofs4 [last unloaded: mvpp2]
          [ 2744.115654] CPU: 3 PID: 8357 Comm: kworker/3:2 Not tainted 5.3.0-rc2 #1
          [ 2744.115655] Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT)
          [ 2744.115665] Workqueue: events_power_efficient phylink_resolve [phylink]
          [ 2744.115669] pstate: a0000085 (NzCv daIf -PAN -UAO)
          [ 2744.115675] pc : __queue_work+0x9c/0x4d8
          [ 2744.115677] lr : __queue_work+0x170/0x4d8
          [ 2744.115678] sp : ffffff801001bd50
          [ 2744.115680] x29: ffffff801001bd50 x28: ffffffc422597600
          [ 2744.115684] x27: ffffff80109ae6f0 x26: ffffff80108e4018
          [ 2744.115688] x25: 0000000000000003 x24: 0000000000000004
          [ 2744.115691] x23: ffffff80109ae6e0 x22: 0000000000000017
          [ 2744.115694] x21: ffffffc42c030000 x20: ffffffc42209e8f8
          [ 2744.115697] x19: 0000000000000000 x18: 0000000000000000
          [ 2744.115699] x17: 0000000000000000 x16: 0000000000000000
          [ 2744.115701] x15: 0000000000000010 x14: ffffffffffffffff
          [ 2744.115702] x13: ffffff8090e2b95f x12: ffffff8010e2b967
          [ 2744.115704] x11: ffffff8010906000 x10: 0000000000000040
          [ 2744.115706] x9 : ffffff80109223b8 x8 : ffffff80109223b0
          [ 2744.115707] x7 : ffffffc42bc00068 x6 : 0000000000000000
          [ 2744.115709] x5 : ffffffc42bc00000 x4 : 0000000000000000
          [ 2744.115710] x3 : 0000000000000000 x2 : 0000000000000000
          [ 2744.115712] x1 : 0000000000000008 x0 : ffffffc42c030000
          [ 2744.115714] Call trace:
          [ 2744.115716]  __queue_work+0x9c/0x4d8
          [ 2744.115718]  delayed_work_timer_fn+0x28/0x38
          [ 2744.115722]  call_timer_fn+0x3c/0x180
          [ 2744.115723]  expire_timers+0x60/0x168
          [ 2744.115724]  run_timer_softirq+0xbc/0x1e8
          [ 2744.115727]  __do_softirq+0x128/0x320
          [ 2744.115731]  irq_exit+0xa4/0xc0
          [ 2744.115734]  __handle_domain_irq+0x70/0xc0
          [ 2744.115735]  gic_handle_irq+0x58/0xa8
          [ 2744.115737]  el1_irq+0xb8/0x140
          [ 2744.115738]  console_unlock+0x3a0/0x568
          [ 2744.115740]  vprintk_emit+0x200/0x2a0
          [ 2744.115744]  dev_vprintk_emit+0x1c8/0x1e4
          [ 2744.115747]  dev_printk_emit+0x6c/0x7c
          [ 2744.115751]  __netdev_printk+0x104/0x1d8
          [ 2744.115752]  netdev_printk+0x60/0x70
          [ 2744.115756]  phylink_resolve+0x38c/0x3c8 [phylink]
          [ 2744.115758]  process_one_work+0x1f8/0x448
          [ 2744.115760]  worker_thread+0x54/0x500
          [ 2744.115762]  kthread+0x12c/0x130
          [ 2744.115764]  ret_from_fork+0x10/0x1c
          [ 2744.115768] Code: aa1403e0 97fffbbe aa0003f5 b4000700 (f9400261)
      
      Fixes: 118d6298 ("net: mvpp2: add ethtool GOP statistics")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Acked-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      944a83a2
  21. 30 Jul, 2019 2 commits
  22. 23 Jul, 2019 2 commits
  23. 09 Jul, 2019 2 commits
  24. 29 Jun, 2019 1 commit
  25. 20 Jun, 2019 1 commit
  26. 19 Jun, 2019 4 commits
    • M
      net: mvpp2: cls: Add steering based on vlan Id and priority. · 1274daed
      Committed by Maxime Chevallier
      This commit allows using the vlan Id and priority as parts of the key
      for classification offload. These fields are extracted from the
      outermost tag, if multiple tags are present.
      
      Vlan Id and priority are considered as 2 different fields by the
      classifier, however the fields are both appended in the Header Extracted
      Key in the same layout as they are found in the tags. This means that
      when steering only based on the prio, a 16-bit slot is still taken in
      the HEK.
      
      The classifier doesn't allow extracting the DEI bit from the tag, so we
      explicitly prevent the user from using this bit in the key.
      
      This commit adds the vlan priority as a compatible HEK field for
      tagged traffic, meaning that we limit the possibility of extracting this
      field only to the flows that contain tagged traffic.
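
      For reference, the 16-bit tag control field of an 802.1Q tag holds
      the priority in bits 13..15, the DEI bit in bit 12 and the vlan Id in
      bits 0..11. The sketch below shows that layout and a key mask that
      never covers DEI. The helper names, including hek_vlan_mask(), are
      hypothetical and not taken from the driver.

```c
#include <stdint.h>

#define VLAN_VID_MASK   0x0fffu	/* bits 0..11 */
#define VLAN_DEI_MASK   0x1000u	/* bit 12, not extractable */
#define VLAN_PRIO_MASK  0xe000u	/* bits 13..15 */
#define VLAN_PRIO_SHIFT 13

static uint16_t tci_vid(uint16_t tci)
{
	return tci & VLAN_VID_MASK;
}

static uint16_t tci_prio(uint16_t tci)
{
	return (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT;
}

/* Mask over the 16-bit slot: both fields share it, DEI is never set. */
static uint16_t hek_vlan_mask(int match_vid, int match_prio)
{
	uint16_t mask = 0;

	if (match_vid)
		mask |= VLAN_VID_MASK;
	if (match_prio)
		mask |= VLAN_PRIO_MASK;
	return mask;
}
```

      Matching on priority alone still occupies the whole 16-bit slot; only
      the mask (0xe000 here) differs from the combined case (0xefff).
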
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1274daed
    • M
      net: mvpp2: cls: right-justify the C2 TCAM keys · 12b8e2dd
      Committed by Maxime Chevallier
      The C2 TCAM used for classification uses a key (Header Extracted Key)
      built by concatenating several fields extracted from the packet header.
      
      After a lot of trial-and-error and some guesswork, it seems the HEK is
      right justified, with the first fields being stored in the MSB, then
      concatenated up until the LSB.
      
      Until now, this hasn't caused any issue since all HEK fields we use are
      full bytes. However this is an issue for the upcoming VLAN id and pri
      extraction, which aren't full bytes.
      
      Rework the way we build that TCAM key, by changing the order in which we
      append the fields.
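
      One way to picture a right-justified key is the sketch below: each
      appended field shifts the earlier ones towards the MSB, so the last
      field always ends at the LSB even when widths are not whole bytes.
      The struct and function are hypothetical and only illustrate the
      packing idea; they are not the driver's implementation.

```c
#include <stdint.h>

/* Hypothetical key accumulator: 'bits' is right-justified, 'len' is
 * the number of bits used so far (must stay at or below 64). */
struct hek {
	uint64_t bits;
	unsigned int len;
};

/* Append a field of 'width' bits (1..63) below the existing fields. */
static void hek_append(struct hek *k, uint64_t value, unsigned int width)
{
	uint64_t mask = (1ULL << width) - 1;

	k->bits = (k->bits << width) | (value & mask);
	k->len += width;
}
```

      Appending an 8-bit field 0xab followed by a 12-bit field 0x123 gives
      a 20-bit key of 0xab123, with the first field in the upper bits.
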
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      12b8e2dd
    • M
      net: mvpp2: cls: Only select applicable flows of classification offload · 834df6ea
      Committed by Maxime Chevallier
      The way we currently handle classification offload and RSS is by having
      dedicated lookup sequences in the flow table, each being selected
      depending on several fields being present in the packet header.
      
      We need to make sure the classification operation we want to perform can
      be done in each flow we want to insert it into. As an example,
      classifying on VLAN tag can only be done on flows used for tagged
      traffic.
      
      This commit makes sure we don't insert rules in flows we aren't
      compatible with.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      834df6ea
    • M
      net: mvpp2: cls: Use a dedicated lu_type for the RSS lookup · c641af4f
      Committed by Maxime Chevallier
      When performing a TCAM lookup in the C2 engine, it's possible that
      multiple entries match the packet. To make sure the correct entry matches
      when performing a lookup, the Flow Table can set a lookup type, which
      will be used in the TCAM lookup, thus preventing such false-positives.
      
      We need to make sure the RSS match doesn't interfere with other
      classification lookups, hence we use a dedicated lookup_type for it.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c641af4f
  27. 13 Jun, 2019 2 commits