1. 30 Jan 2021, 1 commit
• net: dsa: allow changing the tag protocol via the "tagging" device attribute · 53da0eba
Committed by Vladimir Oltean
      Currently DSA exposes the following sysfs:
      $ cat /sys/class/net/eno2/dsa/tagging
      ocelot
      
      which is a read-only device attribute, introduced in the kernel as
      commit 98cdb480 ("net: dsa: Expose tagging protocol to user-space"),
      and used by libpcap since its commit 993db3800d7d ("Add support for DSA
      link-layer types").
      
      It would be nice if we could extend this device attribute by making it
      writable:
      $ echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
      
      This is useful with DSA switches that can make use of more than one
      tagging protocol. It may be useful in dsa_loop in the future too, to
      perform offline testing of various taggers, or for changing between dsa
      and edsa on Marvell switches, if that is desirable.
      
      In terms of implementation, drivers can support this feature by
      implementing .change_tag_protocol, which should always leave the switch
      in a consistent state: either with the new protocol if things went well,
      or with the old one if something failed. Teardown of the old protocol,
      if necessary, must be handled by the driver.
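
A minimal sketch of what such an op can look like (illustrative only: the
"foo" driver name and its private struct are made up, the op signatures
follow my reading of include/net/dsa.h around this change, and the ocelot
tag protocol enum values are assumed; only the .change_tag_protocol /
.get_tag_protocol op names and the consistency requirement come from this
patch):

        #include <net/dsa.h>

        /* Hypothetical driver state; not taken from any real driver. */
        struct foo_priv {
                enum dsa_tag_protocol tag_proto;
        };

        static int foo_change_tag_protocol(struct dsa_switch *ds, int port,
                                           enum dsa_tag_protocol proto)
        {
                struct foo_priv *priv = ds->priv;

                switch (proto) {
                case DSA_TAG_PROTO_OCELOT:
                case DSA_TAG_PROTO_OCELOT_8021Q:
                        /* Reprogram the hardware for 'proto' on the CPU port
                         * here. On failure, undo any partial changes and
                         * return the error, so the old protocol stays active.
                         */
                        priv->tag_proto = proto;
                        return 0;
                default:
                        return -EPROTONOSUPPORT;
                }
        }

        static enum dsa_tag_protocol
        foo_get_tag_protocol(struct dsa_switch *ds, int port,
                             enum dsa_tag_protocol mp)
        {
                struct foo_priv *priv = ds->priv;

                /* Report the protocol in current use, not a constant. */
                return priv->tag_proto;
        }

        static const struct dsa_switch_ops foo_switch_ops = {
                .get_tag_protocol       = foo_get_tag_protocol,
                .change_tag_protocol    = foo_change_tag_protocol,
                /* other mandatory ops omitted from this sketch */
        };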
      
      Some things remain as before:
- The .get_tag_protocol method is still only called at probe time, to load
  the initial tagging protocol driver. Nonetheless, new drivers should
  report from it the tagging protocol currently in use.
- The driver should still manage the initial setup of the tagging
  protocol by itself, no later than the .setup() method, as well as
  destroy the resources used by the last tagger in use, no earlier than
  the .teardown() method.
      
      For multi-switch DSA trees, error handling is a bit more complicated,
      since e.g. the 5th out of 7 switches may fail to change the tag
      protocol. When that happens, a revert to the original tag protocol is
      attempted, but that may fail too, leaving the tree in an inconsistent
      state despite each individual switch implementing .change_tag_protocol
      transactionally. Since the intersection between drivers that implement
      .change_tag_protocol and drivers that support D in DSA is currently the
      empty set, the possibility for this error to happen is ignored for now.
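
The strategy described above boils down to the loop below, shown as a
self-contained user-space sketch rather than the actual DSA core code (the
function names and the failing 5th switch are purely illustrative):

        #include <stdio.h>

        #define NUM_SWITCHES 7

        /* Stand-in for the per-switch .change_tag_protocol call; switch
         * index 4 (the 5th one) fails when asked for the new protocol, to
         * exercise the revert path.
         */
        static int change_tag_protocol(int sw, int proto)
        {
                return (sw == 4 && proto == 1) ? -1 : 0;
        }

        static int tree_change_tag_protocol(int new_proto, int old_proto)
        {
                int i, j, err;

                for (i = 0; i < NUM_SWITCHES; i++) {
                        err = change_tag_protocol(i, new_proto);
                        if (err)
                                goto revert;
                }
                return 0;

        revert:
                /* Best effort: put the switches that already changed back on
                 * the old protocol. If this fails too, the tree ends up
                 * inconsistent, which is the case the text above accepts.
                 */
                for (j = 0; j < i; j++)
                        change_tag_protocol(j, old_proto);
                return err;
        }

        int main(void)
        {
                printf("result: %d\n", tree_change_tag_protocol(1, 0));
                return 0;
        }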
      
      Testing:
      
      $ insmod mscc_felix.ko
      [   79.549784] mscc_felix 0000:00:00.5: Adding to iommu group 14
      [   79.565712] mscc_felix 0000:00:00.5: Failed to register DSA switch: -517
      $ insmod tag_ocelot.ko
      $ rmmod mscc_felix.ko
      $ insmod mscc_felix.ko
      [   97.261724] libphy: VSC9959 internal MDIO bus: probed
      [   97.267363] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 0
      [   97.274998] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 1
      [   97.282561] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 2
      [   97.289700] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 3
      [   97.599163] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
      [   97.862034] mscc_felix 0000:00:00.5 swp1 (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
      [   97.950731] mscc_felix 0000:00:00.5 swp0: configuring for inband/qsgmii link mode
      [   97.964278] 8021q: adding VLAN 0 to HW filter on device swp0
      [   98.146161] mscc_felix 0000:00:00.5 swp2 (uninitialized): PHY [0000:00:00.3:12] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
      [   98.238649] mscc_felix 0000:00:00.5 swp1: configuring for inband/qsgmii link mode
      [   98.251845] 8021q: adding VLAN 0 to HW filter on device swp1
      [   98.433916] mscc_felix 0000:00:00.5 swp3 (uninitialized): PHY [0000:00:00.3:13] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
      [   98.485542] mscc_felix 0000:00:00.5: configuring for fixed/internal link mode
      [   98.503584] mscc_felix 0000:00:00.5: Link is Up - 2.5Gbps/Full - flow control rx/tx
      [   98.527948] device eno2 entered promiscuous mode
      [   98.544755] DSA: tree 0 setup
      
      $ ping 10.0.0.1
      PING 10.0.0.1 (10.0.0.1): 56 data bytes
      64 bytes from 10.0.0.1: seq=0 ttl=64 time=2.337 ms
      64 bytes from 10.0.0.1: seq=1 ttl=64 time=0.754 ms
      ^C
--- 10.0.0.1 ping statistics ---
      2 packets transmitted, 2 packets received, 0% packet loss
      round-trip min/avg/max = 0.754/1.545/2.337 ms
      
      $ cat /sys/class/net/eno2/dsa/tagging
      ocelot
      $ cat ./test_ocelot_8021q.sh
              #!/bin/bash
      
              ip link set swp0 down
              ip link set swp1 down
              ip link set swp2 down
              ip link set swp3 down
              ip link set swp5 down
              ip link set eno2 down
              echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
              ip link set eno2 up
              ip link set swp0 up
              ip link set swp1 up
              ip link set swp2 up
              ip link set swp3 up
              ip link set swp5 up
      $ ./test_ocelot_8021q.sh
      ./test_ocelot_8021q.sh: line 9: echo: write error: Protocol not available
      $ rmmod tag_ocelot.ko
      rmmod: can't unload module 'tag_ocelot': Resource temporarily unavailable
      $ insmod tag_ocelot_8021q.ko
      $ ./test_ocelot_8021q.sh
      $ cat /sys/class/net/eno2/dsa/tagging
      ocelot-8021q
      $ rmmod tag_ocelot.ko
      $ rmmod tag_ocelot_8021q.ko
      rmmod: can't unload module 'tag_ocelot_8021q': Resource temporarily unavailable
      $ ping 10.0.0.1
      PING 10.0.0.1 (10.0.0.1): 56 data bytes
      64 bytes from 10.0.0.1: seq=0 ttl=64 time=0.953 ms
      64 bytes from 10.0.0.1: seq=1 ttl=64 time=0.787 ms
      64 bytes from 10.0.0.1: seq=2 ttl=64 time=0.771 ms
      $ rmmod mscc_felix.ko
      [  645.544426] mscc_felix 0000:00:00.5: Link is Down
      [  645.838608] DSA: tree 0 torn down
      $ rmmod tag_ocelot_8021q.ko
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2. 15 Jan 2021, 1 commit
3. 08 Jan 2021, 1 commit
4. 10 Nov 2020, 1 commit
5. 05 Oct 2020, 1 commit
6. 03 Oct 2020, 1 commit
7. 19 Sep 2020, 2 commits
8. 08 May 2020, 1 commit
• net: dsa: introduce a dsa_port_from_netdev public helper · e1eea811
Committed by Vladimir Oltean
As its implementation shows, this is synonymous with calling
dsa_slave_dev_check followed by dsa_slave_to_port, so it is quite simple
and only exposes functionality which is already there.
      
      However there is now a need for these functions outside dsa_priv.h, for
      example in drivers that perform mirroring and redirection through
      tc-flower offloads (they are given raw access to the flow_cls_offload
      structure), where they need to call this function on act->dev.
      
      But simply exporting dsa_slave_to_port would make it non-inline and
      would result in an extra function call in the hotpath, as can be seen
      for example in sja1105:
      
      Before:
      
      000006dc <sja1105_xmit>:
      {
       6dc:	e92d4ff0 	push	{r4, r5, r6, r7, r8, r9, sl, fp, lr}
       6e0:	e1a04000 	mov	r4, r0
       6e4:	e591958c 	ldr	r9, [r1, #1420]	; 0x58c <- Inline dsa_slave_to_port
       6e8:	e1a05001 	mov	r5, r1
       6ec:	e24dd004 	sub	sp, sp, #4
      	u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
       6f0:	e1c901d8 	ldrd	r0, [r9, #24]
       6f4:	ebfffffe 	bl	0 <dsa_8021q_tx_vid>
      			6f4: R_ARM_CALL	dsa_8021q_tx_vid
      	u8 pcp = netdev_txq_to_tc(netdev, queue_mapping);
       6f8:	e1d416b0 	ldrh	r1, [r4, #96]	; 0x60
      	u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
       6fc:	e1a08000 	mov	r8, r0
      
      After:
      
      000006e4 <sja1105_xmit>:
      {
       6e4:	e92d4ff0 	push	{r4, r5, r6, r7, r8, r9, sl, fp, lr}
       6e8:	e1a04000 	mov	r4, r0
       6ec:	e24dd004 	sub	sp, sp, #4
      	struct dsa_port *dp = dsa_slave_to_port(netdev);
       6f0:	e1a00001 	mov	r0, r1
      {
       6f4:	e1a05001 	mov	r5, r1
      	struct dsa_port *dp = dsa_slave_to_port(netdev);
       6f8:	ebfffffe 	bl	0 <dsa_slave_to_port>
      			6f8: R_ARM_CALL	dsa_slave_to_port
       6fc:	e1a09000 	mov	r9, r0
      	u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
       700:	e1c001d8 	ldrd	r0, [r0, #24]
       704:	ebfffffe 	bl	0 <dsa_8021q_tx_vid>
      			704: R_ARM_CALL	dsa_8021q_tx_vid
      
      Because we want to avoid possible performance regressions, introduce
      this new function which is designed to be public.
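
A sketch of how the helper can look, given that it is just the two existing
functions glued together (the exact error value for a non-DSA netdevice is
an assumption here, and the function has to live inside the DSA core, where
dsa_slave_dev_check() and dsa_slave_to_port() are visible via dsa_priv.h):

        struct dsa_port *dsa_port_from_netdev(struct net_device *netdev)
        {
                /* Reject anything that is not a DSA slave interface. */
                if (!netdev || !dsa_slave_dev_check(netdev))
                        return ERR_PTR(-ENODEV);

                /* Stays inline inside the core; drivers get one exported call. */
                return dsa_slave_to_port(netdev);
        }
        EXPORT_SYMBOL_GPL(dsa_port_from_netdev);

A tc-flower offload path would then call this on act->dev and check the
result with IS_ERR() before dereferencing it.
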
Suggested-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9. 24 Apr 2020, 1 commit
• net: dsa: add GRO support via gro_cells · e131a563
Committed by Alexander Lobakin
The gro_cells lib is used by different encapsulating netdevices, such as
geneve, macsec, vxlan etc., to speed up processing of decapsulated
traffic. The CPU tag is a sort of "encapsulation", and we can use the
same mechanism to greatly improve overall DSA performance.
skbs are passed to the GRO layer after the CPU tags are removed, so,
unlike my first GRO-over-DSA variant [1], we don't need any new packet
offload types.
      
      The size of struct gro_cells is sizeof(void *), so hot struct
      dsa_slave_priv becomes only 4/8 bytes bigger, and all critical fields
      remain in one 32-byte cacheline.
The other positive side effect is that drivers for network devices that
can be shipped as CPU ports of DSA-driven switches can now use
napi_gro_frags() to pass skbs to the kernel. Packets built that way are
completely non-linear and would likely be dropped without GRO.
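
A rough illustration of the gro_cells pattern being applied (the struct and
function names below are placeholders, not the actual DSA code; only the
gro_cells_init()/gro_cells_receive()/gro_cells_destroy() API is real):

        #include <linux/netdevice.h>
        #include <net/gro_cells.h>

        struct example_slave_priv {
                struct gro_cells gcells;        /* costs only sizeof(void *) */
        };

        static int example_slave_setup(struct net_device *dev)
        {
                struct example_slave_priv *p = netdev_priv(dev);

                /* One set of per-CPU GRO cells per slave netdevice. */
                return gro_cells_init(&p->gcells, dev);
        }

        /* Receive path, called after the CPU tag has been stripped and
         * skb->dev points at the slave netdevice.
         */
        static void example_rcv_untagged(struct sk_buff *skb)
        {
                struct example_slave_priv *p = netdev_priv(skb->dev);

                /* Replaces a plain netif_receive_skb(), so TCP streams can
                 * be coalesced by GRO before hitting the stack.
                 */
                gro_cells_receive(&p->gcells, skb);
        }

        static void example_slave_teardown(struct net_device *dev)
        {
                struct example_slave_priv *p = netdev_priv(dev);

                gro_cells_destroy(&p->gcells);
        }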
      
This was tested on a soon-to-be-mainlined Ethernet driver that uses
napi_gro_frags(), and the overall performance was on par with the
variant from [1], sometimes even better due to the minimal overhead.
net.core.gro_normal_batch tuning may help to push it to the limit
on particular setups and platforms.
      
iperf3 IPoE VLAN NAT TCP forwarding (port1.218 -> port0) setup
on a 1.2 GHz MIPS board:
      
      5.7-rc2 baseline:
      
      [ID]  Interval         Transfer     Bitrate        Retr
      [ 5]  0.00-120.01 sec  9.00 GBytes  644 Mbits/sec  413  sender
      [ 5]  0.00-120.00 sec  8.99 GBytes  644 Mbits/sec       receiver
      
      Iface      RX packets  TX packets
      eth0       7097731     7097702
      port0      426050      6671829
      port1      6671681     425862
      port1.218  6671677     425851
      
      With this patch:
      
      [ID]  Interval         Transfer     Bitrate        Retr
      [ 5]  0.00-120.01 sec  12.2 GBytes  870 Mbits/sec  122  sender
      [ 5]  0.00-120.00 sec  12.2 GBytes  870 Mbits/sec       receiver
      
      Iface      RX packets  TX packets
      eth0       9474792     9474777
      port0      455200      353288
      port1      9019592     455035
      port1.218  353144      455024
      
      v2:
       - Add some performance examples in the commit message;
       - No functional changes.
      
[1] https://lore.kernel.org/netdev/20191230143028.27313-1-alobakin@dlink.ru/

Signed-off-by: Alexander Lobakin <bloodyreaper@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
10. 31 Mar 2020, 1 commit
11. 06 Nov 2019, 1 commit
12. 29 Oct 2019, 1 commit
13. 23 Oct 2019, 1 commit
14. 31 May 2019, 1 commit
15. 08 May 2019, 1 commit
• net: dsa: Fix error cleanup path in dsa_init_module · 68be9302
Committed by YueHaibing
      BUG: unable to handle kernel paging request at ffffffffa01c5430
      PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0
Oops: 0000 [#1]
      CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ #33
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:raw_notifier_chain_register+0x16/0x40
      Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
      RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282
      RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b
      RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068
      RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700
      R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78
      FS:  00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0
      Call Trace:
       register_netdevice_notifier+0x43/0x250
       ? 0xffffffffa01e0000
 dsa_slave_register_notifier+0x13/0x70 [dsa_core]
 ? 0xffffffffa01e0000
 dsa_init_module+0x2e/0x1000 [dsa_core]
       do_one_initcall+0x6c/0x3cc
       ? do_init_module+0x22/0x1f1
       ? rcu_read_lock_sched_held+0x97/0xb0
       ? kmem_cache_alloc_trace+0x325/0x3b0
       do_init_module+0x5b/0x1f1
       load_module+0x1db1/0x2690
       ? m_show+0x1d0/0x1d0
       __do_sys_finit_module+0xc5/0xd0
       __x64_sys_finit_module+0x15/0x20
       do_syscall_64+0x6b/0x1d0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
Clean up the allocated resources if there are errors, otherwise it will
trigger a memleak.
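
One common way to implement such cleanup, and roughly the shape a fix like
this takes, is a goto-based unwind in the module init path; here is a
self-contained illustration (not the literal dsa_init_module code, the step
names are invented):

        #include <stdio.h>

        static int alloc_workqueue_step(void)       { return 0; }
        static void destroy_workqueue_step(void)    { puts("workqueue destroyed"); }
        static int register_notifier_step(void)     { return 0; }
        static void unregister_notifier_step(void)  { puts("notifier unregistered"); }
        static int legacy_register_step(void)       { return -1; /* simulated failure */ }

        static int example_init_module(void)
        {
                int rc;

                rc = alloc_workqueue_step();
                if (rc)
                        return rc;

                rc = register_notifier_step();
                if (rc)
                        goto err_destroy_workqueue;

                rc = legacy_register_step();
                if (rc)
                        goto err_unregister_notifier;

                return 0;

        err_unregister_notifier:
                /* Undo only what already succeeded, in reverse order, so no
                 * notifier is left registered pointing into unloaded code.
                 */
                unregister_notifier_step();
        err_destroy_workqueue:
                destroy_workqueue_step();
                return rc;
        }

        int main(void)
        {
                printf("init returned %d\n", example_init_module());
                return 0;
        }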
      
      Fixes: c9eb3e0f ("net: dsa: Add support for learning FDB through notification")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
16. 01 May 2019, 1 commit
17. 29 Apr 2019, 9 commits
18. 04 Mar 2019, 1 commit
19. 17 Dec 2018, 1 commit
20. 17 Sep 2018, 1 commit
21. 13 Sep 2018, 1 commit
22. 08 Sep 2018, 1 commit
23. 28 Aug 2018, 1 commit
24. 15 Feb 2018, 1 commit
25. 13 Nov 2017, 1 commit
26. 28 Oct 2017, 1 commit
27. 27 Oct 2017, 1 commit
28. 18 Oct 2017, 1 commit
29. 13 Oct 2017, 1 commit
30. 01 Oct 2017, 1 commit
31. 20 Sep 2017, 1 commit