1. 30 Jan, 2021 12 commits
    • net: dsa: keep a copy of the tagging protocol in the DSA switch tree · 357f203b
      Vladimir Oltean committed
      Cascading DSA switches can be done multiple ways. There is the brute
      force approach / tag stacking, where one upstream switch, located
      between leaf switches and the host Ethernet controller, will just
      happily transport the DSA header of those leaf switches as payload.
      For this kind of setup, DSA works without any special treatment
      compared to a single switch - the switches just aren't aware of each other.
      Then there's the approach where the upstream switch understands the tags
      it transports from its leaves below, as it doesn't push a tag of its own,
      but it routes based on the source port & switch id information present
      in that tag (as opposed to DMAC & VID) and it strips the tag when
      egressing a front-facing port. Currently only Marvell implements the
      latter, and Marvell DSA trees contain only Marvell switches.
      
      So it is safe to say that DSA trees already have a single tag protocol
      shared by all switches, and in fact this is what makes the switches able
      to understand each other. This is also implied by the fact that,
      currently, the tagging protocol is reported through a sysfs attribute
      installed on the DSA master and not per port, so it must be the same for all the
      ports connected to that DSA master regardless of the switch that they
      belong to.
      
      It's time to make this official and enforce it (yes, this also means we
      won't have any "switch understands tag to some extent but is not able to
      speak it" hardware oddities that we'll support in the future).
      
      This is needed due to the imminent introduction of the dsa_switch_ops::
      change_tag_protocol driver API. When that is introduced, we'll have
      to notify switches of the tagging protocol that they're configured to
      use. Currently the tag_ops structure pointer is held only for CPU ports.
      But there are switches which don't have CPU ports and nonetheless still
      need to be configured. These would be Marvell leaf switches whose
      upstream port is just a DSA link. How do we inform these of their
      tagging protocol setup/deletion?
      
      One answer to the above would be: iterate through the DSA switch tree's
      ports once, list the CPU ports, get their tag_ops, then iterate again
      now that we have it, and notify everybody of that tag_ops. But what to
      do if conflicts appear between one cpu_dp->tag_ops and another? There's
      no escaping the fact that conflict resolution needs to be done, so we
      can be upfront about it.
      
      Ease our work and just keep the master copy of the tag_ops inside the
      struct dsa_switch_tree. Reference counting is now moved to be per-tree
      too, instead of per-CPU port.
      
      There are many places in the data path that access master->dsa_ptr->tag_ops
      and we would introduce an unnecessary performance penalty by going through
      yet another indirection, so keep those right where they are.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
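The "one tagging protocol per tree" rule from the commit above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: `struct tag_ops`, `struct switch_tree` and `tree_claim_tag_ops()` are hypothetical stand-ins, and taking the reference only once per tree is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tagger description. */
struct tag_ops {
	int proto;                      /* tagging protocol identifier */
};

/* Hypothetical stand-in for the switch tree. */
struct switch_tree {
	const struct tag_ops *tag_ops;  /* single master copy for the tree */
	unsigned int refcount;          /* counted per tree, not per CPU port */
};

/*
 * Called once for each CPU port found while parsing the tree.
 * Returns 0 on success, or -1 when this CPU port asks for a protocol
 * that differs from the one the tree has already settled on.
 */
static int tree_claim_tag_ops(struct switch_tree *tree,
			      const struct tag_ops *ops)
{
	if (tree->tag_ops) {
		if (tree->tag_ops->proto != ops->proto)
			return -1;      /* conflict: reject up front */
		return 0;               /* same protocol, nothing to do */
	}

	tree->tag_ops = ops;
	tree->refcount = 1;             /* assumption: one reference per tree */
	return 0;
}
```

Resolving the conflict while the ports are parsed means later code can notify every switch, including those without a CPU port, from the tree's single copy.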
    • net: dsa: document the existing switch tree notifiers and add a new one · 886f8e26
      Vladimir Oltean committed
      The existence of dsa_broadcast has generated some confusion in the past:
      https://www.mail-archive.com/netdev@vger.kernel.org/msg365042.html
      
      So let's document the existing dsa_port_notify and dsa_broadcast
      functions and explain when each of them should be used.
      
      In fact, an in-between function, dsa_tree_notify, has always been there
      but lacked a name; giving it one is the main reason for this patch.
      Refactor dsa_broadcast to use it.
      
      This patch also moves dsa_broadcast (a top-level function) to dsa2.c,
      where it really belonged in the first place, but had no companion so it
      stood with dsa_port_notify.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
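The layering described in the commit above, a broadcast built on top of a per-tree notification, might look like this in miniature. All names here (`mini_tree`, `mini_tree_notify`, `mini_broadcast`) are hypothetical, and a simple counter replaces the kernel's notifier chains.

```c
#include <stddef.h>

#define MAX_SWITCHES 4

/* Hypothetical miniature of a switch tree: one event counter per switch. */
struct mini_tree {
	int n_switches;
	int events_seen[MAX_SWITCHES];
};

/* Deliver one event to every switch of a single tree. */
static void mini_tree_notify(struct mini_tree *tree)
{
	for (int i = 0; i < tree->n_switches; i++)
		tree->events_seen[i]++;
}

/* Deliver one event to every switch of every tree, via the per-tree helper. */
static void mini_broadcast(struct mini_tree *trees, size_t n_trees)
{
	for (size_t i = 0; i < n_trees; i++)
		mini_tree_notify(&trees[i]);
}
```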
    • net: dsa: tag_8021q: add helpers to deduce whether a VLAN ID is RX or TX VLAN · 9c7caf28
      Vladimir Oltean committed
      The sja1105 implementation can be blind about this, but the felix driver
      doesn't do exactly what it's being told, so it needs to know whether it
      is a TX or an RX VLAN, so it can install the appropriate type of TCAM
      rule.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
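A helper that tells RX VLANs from TX VLANs can be as small as a mask-and-compare, provided the direction is encoded in the VLAN ID itself. The sketch below assumes a two-bit direction field at bits 11:10 of the VID; that layout, the macro names and the function names are assumptions made for illustration, not taken from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: two direction bits inside the 12-bit VLAN ID. */
#define DIR_SHIFT 10
#define DIR_MASK  (0x3u << DIR_SHIFT)
#define DIR_RX    (0x1u << DIR_SHIFT)
#define DIR_TX    (0x2u << DIR_SHIFT)

/* True when the VID carries the (assumed) RX direction encoding. */
static bool vid_is_rx_vlan(uint16_t vid)
{
	return (vid & DIR_MASK) == DIR_RX;
}

/* True when the VID carries the (assumed) TX direction encoding. */
static bool vid_is_tx_vlan(uint16_t vid)
{
	return (vid & DIR_MASK) == DIR_TX;
}
```

A driver that installs different TCAM rules per direction could branch on these two predicates; a VID with neither encoding is an ordinary VLAN.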
    • net: proc: speedup /proc/net/netstat · 0d6cd689
      Eric Dumazet committed
      Use cache-friendly helpers to make better use of CPU caches
      while reading /proc/net/netstat
      
      Tested on a platform with 256 threads (AMD Rome)
      
      Before: 305 usec spent in netstat_seq_show()
      After: 130 usec spent in netstat_seq_show()
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20210128162145.1703601-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: Remove redundant calls of sk_tx_queue_clear(). · df610cd9
      Kuniyuki Iwashima committed
      The commit 41b14fb8 ("net: Do not clear the sock TX queue in
      sk_set_socket()") removes sk_tx_queue_clear() from sk_set_socket() and adds
      it instead in sk_alloc() and sk_clone_lock() to fix an issue introduced in
      the commit e022f0b4 ("net: Introduce sk_tx_queue_mapping"). On the
      other hand, the original commit had already put sk_tx_queue_clear() in
      sk_prot_alloc(): the callee of sk_alloc() and sk_clone_lock(). Thus
      sk_tx_queue_clear() is called twice in each path.
      
      If we remove sk_tx_queue_clear() in sk_alloc() and sk_clone_lock(), it
      currently works well because (i) sk_tx_queue_mapping is defined between
      sk_dontcopy_begin and sk_dontcopy_end, and (ii) sock_copy() called after
      sk_prot_alloc() in sk_clone_lock() does not overwrite sk_tx_queue_mapping.
      However, if we move sk_tx_queue_mapping out of the no copy area, it
      introduces a bug unintentionally.
      
      Therefore, this patch adds a compile-time check to take care of the order
      of sock_copy() and sk_tx_queue_clear() and removes sk_tx_queue_clear() from
      sk_prot_alloc() so that it does only the allocation and its callers
      initialize the fields.
      
      CC: Boris Pismenny <borisp@mellanox.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Acked-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20210128150217.6060-1-kuniyu@amazon.co.jp
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
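The compile-time check mentioned in the commit above guards against a field silently leaving the no-copy area. A minimal C11 sketch of that idea follows; `struct toy_sock`, its fields and `toy_sock_copy()` are hypothetical, and the real check in the kernel may be written differently.

```c
#include <stddef.h>
#include <string.h>

/* Toy socket: fields from nocopy_first up to nocopy_last are not cloned. */
struct toy_sock {
	int family;            /* copied */
	int nocopy_first;      /* first field of the no-copy area */
	int tx_queue_mapping;  /* must stay inside the no-copy area */
	int nocopy_last;       /* last field of the no-copy area */
	int priority;          /* copied */
};

#define NOCOPY_BEGIN offsetof(struct toy_sock, nocopy_first)
#define NOCOPY_END   offsetof(struct toy_sock, priority)

/* Fails the build if someone moves the field out of the no-copy area. */
_Static_assert(offsetof(struct toy_sock, tx_queue_mapping) >= NOCOPY_BEGIN &&
	       offsetof(struct toy_sock, tx_queue_mapping) < NOCOPY_END,
	       "tx_queue_mapping must live in the no-copy area");

/* Copy everything except the no-copy area, which dst keeps as it was. */
static void toy_sock_copy(struct toy_sock *dst, const struct toy_sock *src)
{
	memcpy(dst, src, NOCOPY_BEGIN);
	memcpy((char *)dst + NOCOPY_END, (const char *)src + NOCOPY_END,
	       sizeof(*dst) - NOCOPY_END);
}
```

Because the copy never touches the protected range, a value cleared before the copy survives it, and the assertion turns a future layout change into a build error instead of a runtime bug.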
    • ip_gre: add csum offload support for gre header · efa1a65c
      Xin Long committed
      This patch is to add csum offload support for gre header:
      
      On the TX path in gre_build_header(), when CHECKSUM_PARTIAL's set
      for inner proto, it will calculate the csum for outer proto, and
      inner csum will be offloaded later. Otherwise, CHECKSUM_PARTIAL
      and csum_start/offset will be set for outer proto, and the outer
      csum will be offloaded later.
      
      On the GSO path in gre_gso_segment(), when CHECKSUM_PARTIAL is
      not set for inner proto and the hardware supports csum offload,
      CHECKSUM_PARTIAL and csum_start/offset will be set for outer
      proto, and outer csum will be offloaded later. Otherwise, it
      will do csum for outer proto by calling gso_make_checksum().
      
      Note that SCTP has to do the csum by itself for non GSO path in
      sctp_packet_pack(), as gre_build_header() can't handle the csum
      with CHECKSUM_PARTIAL set for SCTP CRC csum offload.
      
      v1->v2:
        - remove the SCTP part, as GRE dev doesn't support SCTP CRC CSUM
          and it will always do checksum for SCTP in sctp_packet_pack()
          when it's not a GSO packet.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: support ip generic csum processing in skb_csum_hwoffload_help · 62fafcd6
      Xin Long committed
      The NETIF_F_IP|IPV6_CSUM feature flag indicates UDP and TCP csum offload,
      while the NETIF_F_HW_CSUM feature flag indicates ip generic csum offload
      for HW, which covers not only TCP/UDP csum but also other
      protocols' csum, like GRE's.
      
      However, in skb_csum_hwoffload_help() it only checks features against
      NETIF_F_CSUM_MASK(NETIF_F_HW|IP|IPV6_CSUM). So if it's a non TCP/UDP
      packet and the features don't support NETIF_F_HW_CSUM, but support
      NETIF_F_IP|IPV6_CSUM only, it would still return 0 and leave the HW
      to do csum.
      
      This patch is to support ip generic csum processing by checking
      NETIF_F_HW_CSUM for all protocols, and check (NETIF_F_IP_CSUM |
      NETIF_F_IPV6_CSUM) only for TCP and UDP.
      
      Note that we're using skb->csum_offset to check whether it's a TCP/UDP
      protocol, which might be fragile. However, as Alex said, for now only
      a few L4 protocols request Tx csum offload, so we can defer a more
      robust fix until a new protocol comes with the same csum offset.
      
      v1->v2:
        - do not extend skb->csum_not_inet, but use skb->csum_offset to tell
          if it's a UDP/TCP csum packet.
      v2->v3:
        - add a note in the changelog, as Willem suggested.
      Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
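The decision described in the commit above can be sketched as a standalone predicate: generic checksum offload is accepted for any protocol, while the IP/IPv6-only flags are accepted only when the checksum offset matches TCP or UDP. The flag values, struct layouts and the name `can_offload_csum()` are hypothetical stand-ins for the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits standing in for the NETIF_F_* flags. */
#define F_IP_CSUM   (1u << 0)
#define F_IPV6_CSUM (1u << 1)
#define F_HW_CSUM   (1u << 2)

/* Minimal header layouts, used only to obtain the checksum field offsets. */
struct tcp_hdr {
	uint16_t source, dest;
	uint32_t seq, ack_seq;
	uint16_t flags, window, check, urg_ptr;
};

struct udp_hdr {
	uint16_t source, dest, len, check;
};

/* Returns true when the device can be left to compute the checksum. */
static bool can_offload_csum(uint32_t features, size_t csum_offset)
{
	/* Generic offload handles any protocol's checksum. */
	if (features & F_HW_CSUM)
		return true;

	/* IP/IPv6-only offload is limited to TCP and UDP checksums. */
	if (features & (F_IP_CSUM | F_IPV6_CSUM)) {
		if (csum_offset == offsetof(struct tcp_hdr, check) ||
		    csum_offset == offsetof(struct udp_hdr, check))
			return true;
	}

	/* Otherwise the checksum has to be computed in software. */
	return false;
}
```

Identifying the protocol by checksum offset alone is the fragile part the changelog notes: a different protocol whose checksum field sits at the same offset would be misclassified.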
    • net: atm: pppoatm: use new API for wakeup tasklet · a5874597
      Emil Renner Berthing committed
      This converts the driver to use the new tasklet API introduced in
      commit 12cc923f ("tasklet: Introduce new initialization API")
      Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
      Link: https://lore.kernel.org/r/20210127173256.13954-2-kernel@esmil.dk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: atm: pppoatm: use tasklet_init to initialize wakeup tasklet · a5b88632
      Emil Renner Berthing committed
      Previously a temporary tasklet structure was initialized on the stack
      using DECLARE_TASKLET_OLD() and then copied over and modified. Nothing
      else in the kernel seems to use this pattern, so let's just call
      tasklet_init() like everyone else.
      Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
      Link: https://lore.kernel.org/r/20210127173256.13954-1-kernel@esmil.dk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: flow_offload: Add original direction flag to ct_metadata · 941eff5a
      Paul Blakey committed
      Give offloading drivers the direction of the offloaded ct flow;
      this will be used for matches on direction (ct_state +/-rpl).
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/sched: cls_flower: Add match on the ct_state reply flag · 8c85d18c
      Paul Blakey committed
      Add match on the ct_state reply flag.
      
      Example:
      $ tc filter add dev ens1f0_0 ingress prio 1 chain 1 proto ip flower \
        ct_state +trk+est+rpl \
        action mirred egress redirect dev ens1f0_1
      $ tc filter add dev ens1f0_1 ingress prio 1 chain 1 proto ip flower \
        ct_state +trk+est-rpl \
        action mirred egress redirect dev ens1f0_0
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: packet: make pkt_sk() inline · 8c224751
      Menglong Dong committed
      It's better to make 'pkt_sk()' inline here, as non-inline functions
      shouldn't be defined in headers. Besides, this function is simple
      enough to be inline.
      Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
      Link: https://lore.kernel.org/r/20210127123302.29842-1-dong.menglong@zte.com.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 29 Jan, 2021 19 commits
  3. 28 Jan, 2021 9 commits