1. 09 Feb, 2021 1 commit
  2. 07 Feb, 2021 6 commits
    • V
      net: dsa: felix: propagate the LAG offload ops towards the ocelot lib · 8fe6832e
      Vladimir Oltean committed
      The ocelot switch has been supporting LAG offload since its initial
      commit, however felix could not make use of that, due to lack of a LAG
      abstraction in DSA. Now that we have that, let's forward DSA's calls
      towards the ocelot library, which will deal with setting up the bonding.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • V
      net: mscc: ocelot: rebalance LAGs on link up/down events · 23ca3b72
      Vladimir Oltean committed
      At present there is an issue when ocelot is offloading a bonding
      interface, but one of the links of the physical ports goes down. Traffic
      keeps being hashed towards that destination, and of course gets dropped
      on egress.
      
      Monitor the netdev notifier events emitted by the bonding driver for
      changes in the physical state of lower interfaces, to determine which
      ports are active and which ones are no longer.
      
      Then extend ocelot_get_bond_mask to return either the configured bonding
      interfaces, or the active ones, depending on a boolean argument. The
      code that does rebalancing only needs to do so among the active ports,
      whereas the bridge forwarding mask and the logical port IDs still need
      to look at the permanently bonded ports.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
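The difference between the configured and the active bonding mask can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver's code: `struct model_port`, its `link_up` field and `model_get_bond_mask()` are hypothetical names, and ports are identified by their array index.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port state: which bond (if any) the port belongs to,
 * and whether its link is currently up. */
struct model_port {
	const void *bond;	/* NULL when the port is not in a LAG */
	bool link_up;
};

/* Return a bitmask of the ports offloading `bond`. With only_active set,
 * ports whose link is down are left out of the mask. */
static uint32_t model_get_bond_mask(const struct model_port *ports,
				    size_t num_ports, const void *bond,
				    bool only_active)
{
	uint32_t mask = 0;

	if (!bond)
		return 0;

	for (size_t i = 0; i < num_ports && i < 32; i++) {
		if (ports[i].bond != bond)
			continue;
		if (only_active && !ports[i].link_up)
			continue;
		mask |= UINT32_C(1) << i;
	}

	return mask;
}
```

In this model, rebalancing code would pass `only_active = true`, while code computing the bridge forwarding mask or logical port IDs would pass `false`.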
    • V
      net: mscc: ocelot: drop the use of the "lags" array · 528d3f19
      Vladimir Oltean committed
      We can now simplify the implementation by always using ocelot_get_bond_mask
      to look up the other ports that are offloading the same bonding interface
      as us.
      
      In ocelot_set_aggr_pgids, the code had a way to uniquely iterate through
      LAGs. We need to achieve the same behavior by marking each LAG as visited,
      which we do now by using a temporary 32-bit "visited" bitmask. This is
      ok and we do not need dynamic memory allocation, because we know that
      this switch architecture will not have more than 32 ports (the PGID port
      masks are 32-bit anyway).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
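The "visited" bitmask idea described above can be sketched in plain C as follows. This is an illustrative model, not the code from the patch: `model_for_each_lag()` and its parameters are hypothetical, and each port's bond is represented by an opaque pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Walk all ports and handle each LAG exactly once. bonds[i] is the bond
 * that port i belongs to (NULL for none). Stores one port mask per distinct
 * LAG in lag_masks[] and returns the number of masks stored. */
static size_t model_for_each_lag(const void *const *bonds, size_t num_ports,
				 uint32_t *lag_masks, size_t max_lags)
{
	uint32_t visited = 0;	/* ports whose LAG was already handled */
	size_t num_lags = 0;

	if (num_ports > 32)	/* the masks are 32 bits wide */
		num_ports = 32;

	for (size_t i = 0; i < num_ports; i++) {
		uint32_t mask = 0;

		if (!bonds[i] || (visited & (UINT32_C(1) << i)))
			continue;

		/* Collect every port that shares this port's bond. */
		for (size_t j = 0; j < num_ports; j++)
			if (bonds[j] == bonds[i])
				mask |= UINT32_C(1) << j;

		if (num_lags < max_lags)
			lag_masks[num_lags++] = mask;

		/* Mark the whole LAG as visited so it is not handled again. */
		visited |= mask;
	}

	return num_lags;
}
```

Because the bitmask lives on the stack and is bounded by the 32-bit mask width, no dynamic allocation is needed, which matches the reasoning in the commit message.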
    • V
      net: mscc: ocelot: set up the bonding mask in a way that avoids a net_device · b80af659
      Vladimir Oltean committed
      Since this code should be called from pure switchdev as well as from
      DSA, we must find a way to determine the bonding mask not by looking
      directly at the net_device lowers of the bonding interface, since those
      could have different private structures.
      
      We keep a pointer to the bonding upper interface, if present, in struct
      ocelot_port. Then the bonding mask becomes the bitwise OR of all ports
      that have the same bonding upper interface. This adds a duplication of
      functionality with the current "lags" array, but the duplication will be
      short-lived, since further patches will remove the latter completely.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • K
      net: Introduce {netdev,napi}_alloc_frag_align() · 3f6e687d
      Kevin Hao committed
      The current implementation of {netdev,napi}_alloc_frag() gives no
      alignment guarantee for the returned buffer address. Some hardware,
      however, requires the DMA buffer to be aligned correctly, so buffers
      allocated by {netdev,napi}_alloc_frag() and used by such hardware for
      DMA need a workaround like the one below.
          buf = napi_alloc_frag(really_needed_size + align);
          buf = PTR_ALIGN(buf, align);
      
      This code is ugly and wastes a lot of memory if the buffers are used
      in a network driver for TX/RX. We have added alignment support to the
      page_frag functions, so add the corresponding {netdev,napi}_frag
      functions.
      Signed-off-by: Kevin Hao <haokexin@gmail.com>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • K
      mm: page_frag: Introduce page_frag_alloc_align() · b358e212
      Kevin Hao committed
      The current implementation of page_frag_alloc() gives no alignment
      guarantee for the returned buffer address. Some hardware, however,
      requires the DMA buffer to be aligned correctly, so buffers allocated
      by page_frag_alloc() and used by such hardware for DMA need a
      workaround like the one below.
          buf = page_frag_alloc(really_needed_size + align);
          buf = PTR_ALIGN(buf, align);
      
      This code is ugly and wastes a lot of memory if the buffers are used
      in a network driver for TX/RX. So introduce page_frag_alloc_align()
      to make sure that an aligned buffer address is returned.
      Signed-off-by: Kevin Hao <haokexin@gmail.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
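To see why aligning inside the allocator wastes less memory than over-allocating by `align` bytes, consider the userspace model below. It is a sketch, not the kernel implementation: `struct model_frag_cache` and `model_frag_alloc_align()` are hypothetical names, fragments are assumed to be carved from the top of a buffer downward, and the function returns an offset rather than a pointer.

```c
#include <stddef.h>

/* Hypothetical model of a fragment cache: fragments are carved from the top
 * of a buffer downward, and `offset` is the start of the last fragment. */
struct model_frag_cache {
	size_t offset;
};

/* Carve `fragsz` bytes and return the fragment's offset within the buffer,
 * rounded down to a multiple of `align` (which must be a power of two).
 * Returns -1 when the request does not fit or `align` is invalid. */
static long model_frag_alloc_align(struct model_frag_cache *fc,
				   size_t fragsz, size_t align)
{
	size_t offset;

	if (align == 0 || (align & (align - 1)) != 0)
		return -1;
	if (fragsz > fc->offset)
		return -1;

	/* Round the start of the fragment down to the requested alignment. */
	offset = (fc->offset - fragsz) & ~(align - 1);
	fc->offset = offset;

	return (long)offset;
}
```

With a 4096-byte buffer, two 100-byte requests aligned to 64 bytes land at offsets 3968 and 3840 in this model, losing 28 bytes of padding each. The over-allocation workaround would instead reserve 164 bytes per request.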
  3. 06 Feb, 2021 4 commits
    • S
      batman-adv: Drop publication years from copyright info · cfa55c6d
      Sven Eckelmann committed
      The batman-adv source code was using the year of publication (to net-next)
      as "last" year for the copyright statement. The whole source code mentioned
      in the MAINTAINERS "BATMAN ADVANCED" section was handled as a single entity
      regarding the publishing year.
      
      This avoided having outdated (in the sense of year information, not
      copyright holder) publishing information inside several files. But since
      the simple "update copyright year" commit (without other changes) in the
      file was not well received in the upstream kernel, the option to not have
      a copyright year (for initial and last publication) in the files is chosen
      instead.
      More detailed information about the years can still be retrieved from the
      SCM system.
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Acked-by: Marek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    • V
      net/mlx5e: Match recirculated packet miss in slow table using reg_c1 · 8e404fef
      Vlad Buslov committed
      The previous patch in the series, which implements the stack devices RX
      path, adds indirect table rules that match on tunnel VNI. After such a
      rule is created, all tunnel traffic is recirculated to the root table.
      However, a recirculated packet might not match any rule installed in the
      table (for example, when IP traffic follows ARP traffic). In that case
      packets appear on the representor of the tunnel endpoint VF instead of
      being redirected to the VF itself.
      
      Extend slow table with additional flow group that matches on reg_c0 (source
      port value set by indirect tables implemented by previous patch in series)
      and reg_c1 (special 0xFFF mark). When creating offloads fdb tables, install
      one rule per VF vport to match on recirculated miss packets and redirect
      them to appropriate VF vport. Modify indirect tables code to also rewrite
      reg_c1 with special 0xFFF mark.
      
      Implementation reuses reg_c1 tunnel id bits. This is safe to do because
      recirculated packets are always matched before decapsulation.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • V
      net/mlx5e: Refactor reg_c1 usage · 48d216e5
      Vlad Buslov committed
      The following patch in the series uses reg_c1 in eswitch code. To use the
      reg_c1 helpers in both TC and eswitch code, refactor the existing helpers
      along the lines of the similar reg_c0 use case and move the functionality
      into eswitch.h. Calculate the reg mapping lengths from new defines to
      ensure that they are always in sync and only need to be changed in a
      single place.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • V
      net/mlx5e: VF tunnel TX traffic offloading · 10742efc
      Vlad Buslov committed
      When tunnel endpoint is on VF, driver still assumes that endpoint is on
      uplink and incorrectly configures encap rule offload according to that
      assumption. As a result, traffic is sent directly to the uplink and rules
      installed on representor of tunnel endpoint VF are ignored.
      
      Implement following changes to allow offloading tx traffic with tunnel
      endpoint on VF:
      
      - For tunneling flows, perform a route lookup on the route and out device
      pair. If the out device is the uplink and the route device is a VF of the
      same physical port, then modify the packet's reg_c_0 metadata register
      (source port) with the value of the VF vport. Use the eswitch
      vhca_id->vport mapping introduced in one of the previous patches in the
      series to obtain the vport from the route netdevice.
      
      - Recirculate encapsulated packets to VF vport in order to apply any flow
      rules installed on VF representor that match on encapsulated traffic.
      
      Only enable support for this functionality when all following conditions
      are true:
      
      - Hardware advertises capability to preserve reg_c_0 value on packet
      recirculation.
      
      - Vport metadata matching is enabled.
      
      - Termination tables are to be used by the flow.
      
      Example TC rules for VF tunnel traffic:
      
      1. Rule that redirects packets from UL to VF rep that has the tunnel
      endpoint IP address:
      
      $ tc -s filter show dev enp8s0f0 ingress
      filter protocol ip pref 4 flower chain 0
      filter protocol ip pref 4 flower chain 0 handle 0x1
        dst_mac 16:c9:a0:2d:69:2c
        src_mac 0c:42:a1:58:ab:e4
        eth_type ipv4
        ip_flags nofrag
        in_hw in_hw_count 1
              action order 1: mirred (Egress Redirect to device enp8s0f0_0) stolen
              index 3 ref 1 bind 1 installed 377 sec used 0 sec
              Action statistics:
              Sent 114096 bytes 952 pkt (dropped 0, overlimits 0 requeues 0)
              Sent software 0 bytes 0 pkt
              Sent hardware 114096 bytes 952 pkt
              backlog 0b 0p requeues 0
              cookie 878fa48d8c423fc08c3b6ca599b50a97
              no_percpu
              used_hw_stats delayed
      
      2. Rule that decapsulates the tunneled flow and redirects to destination VF
      representor:
      
      $ tc -s filter show dev vxlan_sys_4789 ingress
      filter protocol ip pref 4 flower chain 0
      filter protocol ip pref 4 flower chain 0 handle 0x1
        dst_mac ca:2e:a7:3f:f5:0f
        src_mac 0a:40:bd:30:89:99
        eth_type ipv4
        enc_dst_ip 7.7.7.5
        enc_src_ip 7.7.7.1
        enc_key_id 98
        enc_dst_port 4789
        enc_tos 0
        ip_flags nofrag
        in_hw in_hw_count 1
              action order 1: tunnel_key  unset pipe
               index 2 ref 1 bind 1 installed 434 sec used 434 sec
              Action statistics:
              Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
              backlog 0b 0p requeues 0
              used_hw_stats delayed
      
              action order 2: mirred (Egress Redirect to device enp8s0f0_1) stolen
              index 4 ref 1 bind 1 installed 434 sec used 0 sec
              Action statistics:
              Sent 129936 bytes 1082 pkt (dropped 0, overlimits 0 requeues 0)
              Sent software 0 bytes 0 pkt
              Sent hardware 129936 bytes 1082 pkt
              backlog 0b 0p requeues 0
              cookie ac17cf398c4c69e4a5b2f7aabd1b88ff
              no_percpu
              used_hw_stats delayed
      Co-developed-by: Dmytro Linkin <dlinkin@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  4. 05 Feb, 2021 12 commits
  5. 04 Feb, 2021 8 commits
  6. 03 Feb, 2021 3 commits
    • A
      net: ipv6: Emit notification when fib hardware flags are changed · 907eea48
      Amit Cohen committed
      After installing a route to the kernel, user space receives an
      acknowledgment, which means the route was installed in the kernel,
      but not necessarily in hardware.
      
      The asynchronous nature of route installation in hardware can lead
      to a routing daemon advertising a route before it was actually installed in
      hardware. This can result in packet loss or mis-routed packets until the
      route is installed in hardware.
      
      It is also possible for a route already installed in hardware to change
      its action and therefore its flags. For example, a host route that is
      trapping packets can be "promoted" to perform decapsulation following
      the installation of an IPinIP/VXLAN tunnel.
      
      Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags
      are changed. The aim is to provide an indication to user-space
      (e.g., routing daemons) about the state of the route in hardware.
      
      Introduce a sysctl that controls this behavior.
      
      Keep the default value at 0 (i.e., do not emit notifications) for several
      reasons:
      - Multiple RTM_NEWROUTE notifications per route might confuse existing
        routing daemons.
      - Convergence reasons in routing daemons.
      - The extra notifications will negatively impact the insertion rate.
      - Not all users are interested in these notifications.
      
      Move fib6_info_hw_flags_set() to C file because it is no longer a short
      function.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
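The behavior described above (notify only when a hardware flag actually changes, and only when the sysctl is enabled) can be modelled in a few lines. This is a hedged sketch rather than the kernel function: `struct model_route`, `model_hw_flags_set()` and the `notify_on_flag_change` parameter are hypothetical stand-ins for the route, the flag setter and the sysctl value.

```c
#include <stdbool.h>

/* Hypothetical route state: the two hardware flags named in the commit. */
struct model_route {
	bool offload;
	bool trap;
};

/* Update the hardware flags of a route. Returns true when a notification
 * would be emitted: the flags actually changed and the (modelled) sysctl
 * is enabled. */
static bool model_hw_flags_set(struct model_route *rt, bool offload, bool trap,
			       int notify_on_flag_change)
{
	/* Nothing changed: nothing to store and nothing to report. */
	if (rt->offload == offload && rt->trap == trap)
		return false;

	rt->offload = offload;
	rt->trap = trap;

	/* Default sysctl value 0 keeps the old, silent behavior. */
	return notify_on_flag_change != 0;
}
```

With the default value of 0, callers see exactly the previous behavior, which is why existing routing daemons are unaffected unless the administrator opts in.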
    • A
      net: Pass 'net' struct as first argument to fib6_info_hw_flags_set() · fbaca8f8
      Amit Cohen committed
      The next patch will emit a notification when hardware flags are changed,
      if the fib_notify_on_flag_change sysctl is set to 1.
      
      The net struct is needed to read sysctl values.
      This change is consistent with the IPv4 version, which gets the 'net' struct
      as its first argument.
      
      Currently, the only callers of this function are mlxsw and netdevsim.
      Patch the callers to pass net.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • A
      net: ipv4: Emit notification when fib hardware flags are changed · 680aea08
      Amit Cohen committed
      After installing a route to the kernel, user space receives an
      acknowledgment, which means the route was installed in the kernel,
      but not necessarily in hardware.
      
      The asynchronous nature of route installation in hardware can lead to a
      routing daemon advertising a route before it was actually installed in
      hardware. This can result in packet loss or mis-routed packets until the
      route is installed in hardware.
      
      It is also possible for a route already installed in hardware to change
      its action and therefore its flags. For example, a host route that is
      trapping packets can be "promoted" to perform decapsulation following
      the installation of an IPinIP/VXLAN tunnel.
      
      Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags
      are changed. The aim is to provide an indication to user-space
      (e.g., routing daemons) about the state of the route in hardware.
      
      Introduce a sysctl that controls this behavior.
      
      Keep the default value at 0 (i.e., do not emit notifications) for several
      reasons:
      - Multiple RTM_NEWROUTE notifications per route might confuse existing
        routing daemons.
      - Convergence reasons in routing daemons.
      - The extra notifications will negatively impact the insertion rate.
      - Not all users are interested in these notifications.
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Acked-by: Roopa Prabhu <roopa@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  7. 02 Feb, 2021 3 commits
    • D
      udp: ipv4: manipulate network header of NATed UDP GRO fraglist · c3df39ac
      Dongseok Yi committed
      The UDP/IP headers of UDP GROed frag_skbs are not updated even after NAT
      forwarding. Only the header of head_skb from ip_finish_output_gso ->
      skb_gso_segment is updated; the following frag_skbs are not.
      
      The call path skb_mac_gso_segment -> inet_gso_segment ->
      udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list
      does not try to update the UDP/IP header of the segment list and copies
      only the MAC header.
      
      Update port, addr and check of each skb of the segment list in
      __udp_gso_segment_list. It covers both SNAT and DNAT.
      
      Fixes: 9fd1ff5d (udp: Support UDP fraglist GRO/GSO.)
      Signed-off-by: Dongseok Yi <dseok.yi@samsung.com>
      Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
      Link: https://lore.kernel.org/r/1611962007-80092-1-git-send-email-dseok.yi@samsung.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
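Rewriting a port or address in each segment also means fixing up the checksum. The usual technique is an incremental update in the style of RFC 1624 instead of a full recompute. The sketch below shows that arithmetic in userspace C for a single 16-bit field. The function names are hypothetical and this is not the code from the patch; the kernel has its own checksum helpers for this purpose.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's complement sum. */
static uint16_t model_csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full Internet checksum over `n` 16-bit words (checksum field set to 0). */
static uint16_t model_csum(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		sum += words[i];
		sum = (sum & 0xffff) + (sum >> 16);	/* keep sum small */
	}

	return (uint16_t)~model_csum_fold(sum);
}

/* Incremental update after one 16-bit word changes from `old_word` to
 * `new_word`: HC' = ~(~HC + ~m + m'), as in RFC 1624. */
static uint16_t model_csum_replace16(uint16_t check, uint16_t old_word,
				     uint16_t new_word)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old_word;
	sum += new_word;

	return (uint16_t)~model_csum_fold(sum);
}
```

A NAT rewrite of a 32-bit address would apply the same update to both 16-bit halves. For UDP, the address change also affects the UDP checksum through the pseudo-header, which is why the commit updates port, addr and check together.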
    • A
      net: sched: replaced invalid qdisc tree flush helper in qdisc_replace · 938e0fcd
      Alexander Ovechkin committed
      Commit e5f0e8f8 ("net: sched: introduce and use qdisc tree flush/purge helpers")
      introduced qdisc tree flush/purge helpers, but erroneously used the flush helper
      instead of the purge helper in the qdisc_replace function.
      This issue was found in our CI, which tests various qdisc setups by configuring
      a qdisc and sending data through it. Calling the wrong helper sporadically leads
      to corruption of the vt_tree/cf_tree of hfsc_class, which causes a kernel oops:
      
       Oops: 0000 [#1] SMP PTI
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.11.0-8f6859df #1
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       RIP: 0010:rb_insert_color+0x18/0x190
       Code: c3 31 c0 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 07 48 85 c0 0f 84 05 01 00 00 48 8b 10 f6 c2 01 0f 85 34 01 00 00 <48> 8b 4a 08 49 89 d0 48 39 c1 74 7d 48 85 c9 74 32 f6 01 01 75 2d
       RSP: 0018:ffffc900000b8bb0 EFLAGS: 00010246
       RAX: ffff8881ef4c38b0 RBX: ffff8881d956e400 RCX: ffff8881ef4c38b0
       RDX: 0000000000000000 RSI: ffff8881d956f0a8 RDI: ffff8881d956e4b0
       RBP: 0000000000000000 R08: 000000d5c4e249da R09: 1600000000000000
       R10: ffffc900000b8be0 R11: ffffc900000b8b28 R12: 0000000000000001
       R13: 000000000000005a R14: ffff8881f0905000 R15: ffff8881f0387d00
       FS:  0000000000000000(0000) GS:ffff8881f8b00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000008 CR3: 00000001f4796004 CR4: 0000000000060ee0
       Call Trace:
        <IRQ>
        init_vf.isra.19+0xec/0x250 [sch_hfsc]
        hfsc_enqueue+0x245/0x300 [sch_hfsc]
        ? fib_rules_lookup+0x12a/0x1d0
        ? __dev_queue_xmit+0x4b6/0x930
        ? hfsc_delete_class+0x250/0x250 [sch_hfsc]
        __dev_queue_xmit+0x4b6/0x930
        ? ip6_finish_output2+0x24d/0x590
        ip6_finish_output2+0x24d/0x590
        ? ip6_output+0x6c/0x130
        ip6_output+0x6c/0x130
        ? __ip6_finish_output+0x110/0x110
        mld_sendpack+0x224/0x230
        mld_ifc_timer_expire+0x186/0x2c0
        ? igmp6_group_dropped+0x200/0x200
        call_timer_fn+0x2d/0x150
        run_timer_softirq+0x20c/0x480
        ? tick_sched_do_timer+0x60/0x60
        ? tick_sched_timer+0x37/0x70
        __do_softirq+0xf7/0x2cb
        irq_exit+0xa0/0xb0
        smp_apic_timer_interrupt+0x74/0x150
        apic_timer_interrupt+0xf/0x20
        </IRQ>
      
      Fixes: e5f0e8f8 ("net: sched: introduce and use qdisc tree flush/purge helpers")
      Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
      Reported-by: Alexander Kuznetsov <wwfq@yandex-team.ru>
      Acked-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
      Acked-by: Dmitry Yakunin <zeil@yandex-team.ru>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Link: https://lore.kernel.org/r/20210201200049.299153-1-ovov@yandex-team.ru
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • J
      cfg80211: fix netdev registration deadlock · 40c575d1
      Johannes Berg committed
      If register_netdevice() fails after having called cfg80211's
      netdev notifier (cfg80211_netdev_notifier_call) it will call
      the notifier again with UNREGISTER. This would then lock the
      wiphy mutex because we're marked as registered, which causes
      a deadlock.
      
      Fix this by separately keeping track of whether or not we're
      in the middle of registering to also skip the notifier call
      on this unregister.
      
      Reported-by: syzbot+2ae0ca9d7737ad1a62b7@syzkaller.appspotmail.com
      Fixes: a05829a7 ("cfg80211: avoid holding the RTNL when calling the driver")
      Link: https://lore.kernel.org/r/20210201192048.ed8bad436737.I7cae042c44b15f80919a285799a15df467e9d42d@changeid
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
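The deadlock and its fix can be reproduced in a tiny single-threaded model. Everything here is hypothetical scaffolding (`struct model_wdev`, `model_lock()`, `model_register_fails()`): it assumes the caller already holds a non-recursive mutex when registration fails, and it records a would-be deadlock in a flag instead of actually blocking.

```c
#include <stdbool.h>

/* Hypothetical model: one non-recursive mutex plus the two state flags. */
struct model_wdev {
	bool mutex_held;
	bool registered;
	bool registering;
	bool deadlocked;	/* set when the mutex is taken while held */
};

static void model_lock(struct model_wdev *w)
{
	if (w->mutex_held)
		w->deadlocked = true;
	w->mutex_held = true;
}

static void model_unlock(struct model_wdev *w)
{
	w->mutex_held = false;
}

/* UNREGISTER notifier: tears the device down under the mutex, unless the
 * fix is enabled and a registration is still in flight. */
static void model_notify_unregister(struct model_wdev *w, bool with_fix)
{
	if (!w->registered)
		return;
	if (with_fix && w->registering)
		return;

	model_lock(w);
	w->registered = false;
	model_unlock(w);
}

/* Simulate a registration that fails after the notifier has marked the
 * device as registered. Returns true if the model would have deadlocked. */
static bool model_register_fails(bool with_fix)
{
	struct model_wdev w = { 0 };

	model_lock(&w);			/* assumption: caller holds the mutex */
	w.registering = true;
	w.registered = true;		/* set by the REGISTER notifier */
	model_notify_unregister(&w, with_fix);	/* rollback path */
	w.registering = false;
	model_unlock(&w);

	return w.deadlocked;
}
```

Without the extra flag the rollback path re-enters the lock. With it, the notifier recognizes that registration is still in progress and leaves cleanup to the caller.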
  8. 30 Jan, 2021 3 commits
    • N
      tcp: shrink inet_connection_sock icsk_mtup enabled and probe_size · 14e8e0f6
      Neal Cardwell committed
      This commit shrinks inet_connection_sock by 4 bytes, by shrinking
      icsk_mtup.enabled from 32 bits to 1 bit, and shrinking
      icsk_mtup.probe_size from s32 to an unsigned 31 bit field.
      
      This is to save space to compensate for the recent introduction of a
      new u32 in inet_connection_sock, icsk_probes_tstamp, in the recent bug
      fix commit 9d9b1ee0 ("tcp: fix TCP_USER_TIMEOUT with zero window").
      
      This should not change functionality, since icsk_mtup.enabled is only
      ever set to 0 or 1, and icsk_mtup.probe_size can only be either 0
      or a positive MTU value returned by tcp_mss_to_mtu().
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20210129185438.1813237-1-ncardwell.kernel@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
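The space saving comes from packing the flag and the probe size into one 32-bit word. The two structs below are a simplified illustration, not the real inet_connection_sock layout: the `enabled` and `probe_size` names follow the commit message, while the other members and the `model_` struct names are stand-ins. The exact bit-field layout is compiler-dependent, although common ABIs place both fields in a single `unsigned int`.

```c
/* Before: a 32-bit flag and a signed 32-bit size. */
struct model_mtup_old {
	int search_high;	/* stand-in member */
	int search_low;		/* stand-in member */
	int enabled;
	int probe_size;
};

/* After: both values share a single 32-bit word. probe_size is either 0 or
 * a positive MTU, so 31 unsigned bits are enough; enabled is only 0 or 1. */
struct model_mtup_new {
	int search_high;	/* stand-in member */
	int search_low;		/* stand-in member */
	unsigned int probe_size:31,
		     enabled:1;
};
```

On a platform with 4-byte `int`, the second struct is 4 bytes smaller than the first, which is the saving the commit describes.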
    • V
      net: dsa: felix: perform switch setup for tag_8021q · e21268ef
      Vladimir Oltean committed
      Unlike sja1105, the only other user of the software-defined tag_8021q.c
      tagger format, the implementation we choose for the Felix DSA switch
      driver preserves full functionality under a vlan_filtering bridge
      (i.e. IP termination works through the DSA user ports under all
      circumstances).
      
      The tag_8021q protocol just wants:
      - Identifying the ingress switch port based on the RX VLAN ID, as seen
        by the CPU. We achieve this by using the TCAM engines (which are also
        used for tc-flower offload) to push the RX VLAN as a second, outer
        tag, on egress towards the CPU port.
      - Steering traffic injected into the switch from the network stack
        towards the correct front port based on the TX VLAN, and consuming
        (popping) that header on the switch's egress.
      
      A tc-flower pseudocode of the static configuration done by the driver
      would look like this:
      
      $ tc qdisc add dev <cpu-port> clsact
      $ for eth in swp0 swp1 swp2 swp3; do \
      	tc filter add dev <cpu-port> egress flower indev ${eth} \
      		action vlan push id <rxvlan> protocol 802.1ad; \
      	tc filter add dev <cpu-port> ingress protocol 802.1Q flower \
      		vlan_id <txvlan> action vlan pop \
      		action mirred egress redirect dev ${eth}; \
      done
      
      but of course since DSA does not register network interfaces for the CPU
      port, this configuration would be impossible for the user to do. Also,
      due to the same reason, it is impossible for the user to inadvertently
      delete these rules using tc. These rules do not collide in any way with
      tc-flower, they just consume some TCAM space, which is something we can
      live with.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
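The two requirements of the tagger (recover the ingress port from the RX VLAN, pick the egress port from the TX VLAN) amount to a reversible port-to-VID mapping. The sketch below uses a made-up layout purely for illustration: the base values, the port count and all `model_` names are assumptions, and the real tag_8021q VID encoding is different and not reproduced here.

```c
#include <stdint.h>

/* Made-up VID layout for this sketch only. */
#define MODEL_RX_VID_BASE	0x400
#define MODEL_TX_VID_BASE	0x800
#define MODEL_NUM_PORTS		4

/* VLAN pushed on egress towards the CPU, identifying the ingress port. */
static uint16_t model_rx_vid(int port)
{
	return (uint16_t)(MODEL_RX_VID_BASE + port);
}

/* VLAN added by the stack on injection, selecting the egress port. */
static uint16_t model_tx_vid(int port)
{
	return (uint16_t)(MODEL_TX_VID_BASE + port);
}

/* CPU side: map a received VID back to the ingress port, or -1. */
static int model_rx_vid_to_port(uint16_t vid)
{
	int port = (int)vid - MODEL_RX_VID_BASE;

	return (port >= 0 && port < MODEL_NUM_PORTS) ? port : -1;
}

/* Switch side: map a TX VID to the front port to forward to, or -1. */
static int model_tx_vid_to_port(uint16_t vid)
{
	int port = (int)vid - MODEL_TX_VID_BASE;

	return (port >= 0 && port < MODEL_NUM_PORTS) ? port : -1;
}
```

In hardware, the per-port TCAM rules from the pseudocode above play the role of these lookup functions: one rule per port pushes the RX VLAN, and one rule per port matches the TX VLAN, pops it and redirects.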
    • V
      net: dsa: add a second tagger for Ocelot switches based on tag_8021q · 7c83a7c5
      Vladimir Oltean committed
      There are use cases for which the existing tagger, based on the NPI
      (Node Processor Interface) functionality, is insufficient.
      
      Namely:
      - Frames injected through the NPI port bypass the frame analyzer, so no
        source address learning is performed, no TSN stream classification,
        etc.
      - Flow control is not functional over an NPI port (PAUSE frames are
        encapsulated in the same Extraction Frame Header as all other frames)
      - There can be at most one NPI port configured for an Ocelot switch. But
        in NXP LS1028A and T1040 there are two Ethernet CPU ports. The non-NPI
        port is currently either disabled, or operated as a plain user port
        (albeit an internally-facing one). Having the ability to configure the
        two CPU ports symmetrically could pave the way for e.g. creating a LAG
        between them, to increase bandwidth seamlessly for the system.
      
      So there is a desire to have an alternative to the NPI mode. This change
      keeps the default tagger for the Seville and Felix switches as "ocelot",
      but it can be changed via the following device attribute:
      
      echo ocelot-8021q > /sys/class/net/<dsa-master>/dsa/tagging
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>