1. 11 Jul, 2020 15 commits
    • devlink: Create generic devlink health reporter search function · bd821005
      Vladyslav Tarasiuk committed
      Add a generic __devlink_health_reporter_find_by_name() that can be used
      with an arbitrary devlink health reporter list.
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Rework devlink health reporter destructor · 3c5584bf
      Vladyslav Tarasiuk committed
      Devlink keeps its own reference to every reporter in a list and inits
      refcount to 1 upon reporter's creation. Existing destructor waits to
      free the memory indefinitely using msleep() until all references except
      devlink's own are put.
      
      Rework this mechanism by moving memory free routine to a separate
      function, which is called when the last reporter reference is put.
      
      This also allows calling __devlink_health_reporter_destroy() while
      holding the reporters list mutex, in symmetry with
      __devlink_health_reporter_create(), which is required in a follow-up
      patch.
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
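
The destructor rework in 3c5584bf can be pictured with a small user-space sketch: memory is freed by whichever caller puts the last reference, so the destroy path only drops the owner's reference and never polls or sleeps. All names below are hypothetical and not taken from the devlink code, and the plain counter stands in for an atomic reference count, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a health reporter object. */
struct reporter {
    unsigned int refcount; /* starts at 1: the owner's own reference */
    int *freed_flag;       /* illustration only: lets a caller observe the free */
};

static inline struct reporter *reporter_create(int *freed_flag)
{
    struct reporter *r = malloc(sizeof(*r));

    if (!r)
        return NULL;
    r->refcount = 1;
    r->freed_flag = freed_flag;
    return r;
}

/* Take an extra reference, e.g. for the duration of a dump. */
static inline void reporter_get(struct reporter *r)
{
    r->refcount++;
}

/* Separate free routine, reached only from the last put. */
static inline void reporter_free(struct reporter *r)
{
    if (r->freed_flag)
        *r->freed_flag = 1;
    free(r);
}

/* Drop one reference; the last one out frees the object. */
static inline void reporter_put(struct reporter *r)
{
    assert(r->refcount > 0);
    if (--r->refcount == 0)
        reporter_free(r);
}

/* Destroy drops only the owner's reference; it does not wait. */
static inline void reporter_destroy(struct reporter *r)
{
    reporter_put(r);
}
```

If another user still holds a reference when reporter_destroy() runs, the object stays alive until that user calls reporter_put().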
    • devlink: Refactor devlink health reporter constructor · c57544b3
      Vladyslav Tarasiuk committed
      Prepare a common routine in devlink_health_reporter_create() for usage
      in similar functions for devlink port health reporters.
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-updates-2020-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · d6c7fc0c
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2020-07-09
      
      This series provides updates to mlx5 CT (connection tracking) offloads.
      For more information please see tag log below.
      
      Please pull and let me know if there is any problem.
      
      The following conflict is expected when net is merged into net-next.
      To resolve it, just use the hunks from net-next:
      
      <<<<<<< HEAD (net-next)
      	mlx5_tc_ct_del_ft_entry(ct_priv, entry);
      	kfree(entry);
      ======= (net)
      	mlx5_tc_ct_entry_del_rules(ct_priv, entry);
      	kfree(entry);
      >>>>>>> b1a7d5bdfe54c98eca46e2c997d4e3b1484a49af
      
      mlx5 connection tracking offloads updates:
      
      1)  Restore CT state from lookup in zone instead of tupleid
      
          On a miss, use the zone + 5-tuple taken from the skb to look up the CT
          entry and restore it, instead of the driver-allocated tuple id.
      
          This improves flow insertion rate by avoiding the allocation of a header
          rewrite context to maintain the tupleid.
      
      2) Re-use modify header HW objects for identical modify actions.
      
      3) Expand tunnel register mappings
         Reg_c1 is 32 bits wide. Before this patchset, 24 bits were allocated
         for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel
         options mappings.
      
         Restoring the ct state from zone lookup instead of tuple id requires
         reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
         mappings.
      
         Expand tunnel and tunnel options register mappings to 12 bits each.
      
      4) Trivial cleanup and fixes.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
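
The new reg_c1 split described in item 3 of the tag log (8 bits for the CT zone, 12 bits each for the tunnel and tunnel options mappings) adds up to the full 32 bits. A minimal sketch of packing and unpacking such a layout follows. The field widths come from the text above; the bit positions and all identifiers are assumptions made for illustration and may not match the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Widths from the cover letter: 8 + 12 + 12 = 32 bits. */
#define ZONE_BITS     8u
#define TUN_BITS      12u
#define TUN_OPTS_BITS 12u

/* Assumed ordering: options in the low bits, zone in the high bits. */
#define TUN_OPTS_SHIFT 0u
#define TUN_SHIFT      (TUN_OPTS_SHIFT + TUN_OPTS_BITS)
#define ZONE_SHIFT     (TUN_SHIFT + TUN_BITS)

#define FIELD_MASK(bits) ((1u << (bits)) - 1u)

/* Pack the three mappings into one 32-bit register value. */
static inline uint32_t reg_c1_pack(uint32_t zone, uint32_t tun, uint32_t tun_opts)
{
    assert(zone <= FIELD_MASK(ZONE_BITS));
    assert(tun <= FIELD_MASK(TUN_BITS));
    assert(tun_opts <= FIELD_MASK(TUN_OPTS_BITS));

    return (zone << ZONE_SHIFT) | (tun << TUN_SHIFT) |
           (tun_opts << TUN_OPTS_SHIFT);
}

/* Extract each field again by shifting and masking. */
static inline uint32_t reg_c1_zone(uint32_t reg)
{
    return (reg >> ZONE_SHIFT) & FIELD_MASK(ZONE_BITS);
}

static inline uint32_t reg_c1_tun(uint32_t reg)
{
    return (reg >> TUN_SHIFT) & FIELD_MASK(TUN_BITS);
}

static inline uint32_t reg_c1_tun_opts(uint32_t reg)
{
    return (reg >> TUN_OPTS_SHIFT) & FIELD_MASK(TUN_OPTS_BITS);
}
```

With this assumed ordering, reg_c1_pack(0xAB, 0xFFF, 0x123) yields 0xABFFF123, and each accessor recovers its original field.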
    • Merge branch 'udp_tunnel-add-NIC-RX-port-offload-infrastructure' · 0ea46047
      David S. Miller committed
      Jakub Kicinski says:
      
      ====================
      udp_tunnel: add NIC RX port offload infrastructure
      
      Kernel has a facility to notify drivers about the UDP tunnel ports
      so that devices can recognize tunneled packets. This is important
      mostly for RX - devices which don't support CHECKSUM_COMPLETE can
      report checksums of inner packets, and compute RSS over inner headers.
      Some drivers also match the UDP tunnel ports for TX, although
      doing so may lead to false positives and negatives.
      
      Unfortunately the user experience when trying to take advantage
      of these facilities is suboptimal. First of all there is no way
      for users to check which ports are offloaded. Many drivers resort
      to printing messages to aid debugging, others use debugfs. Even worse,
      the availability of the RX features (NETIF_F_RX_UDP_TUNNEL_PORT)
      is established purely on the basis of the driver having the ndos
      installed. For most drivers, however, the ability to perform offloads
      is contingent on device capabilities (drivers support multiple device
      and firmware versions). Unless the driver resorts to hackish clearing
      of features set incorrectly by the core, users are left guessing
      whether their device really supports UDP tunnel port offload or not.
      
      There is currently no way to indicate or configure whether RX
      features include just the checksum offload or checksum and using
      inner headers for RSS. Many drivers default to not using inner
      headers for RSS because most implementations populate the source
      port with entropy from the inner headers. This, however, is not
      always the case, for example certain switches are only able to
      use a fixed source port during encapsulation.
      
      We have also seen many driver authors get the intricacies of UDP
      tunnel port offloads wrong. Most commonly the drivers forget to
      perform reference counting, or take sleeping locks in the callbacks.
      
      This work tries to improve the situation by pulling the UDP tunnel
      port table maintenance out of the drivers. It turns out that almost
      all drivers maintain a fixed size table of ports (in most cases one
      per tunnel type), so we can take care of all the refcounting in the
      core, and let the driver specify if they need to sleep in the
      callbacks or not. The new common implementation will also support
      replacing ports - when a port is removed from a full table it will
      try to find a previously missing port to take its place.
      
      This patch only implements the core functionality along with a few
      drivers I was hoping to test manually [1] along with a test based
      on a netdevsim implementation. Following patches will convert all
      the drivers. Once that's complete we can remove the ndos, and rely
      directly on the new infrastructure.
      
      Then after RSS (RXFH) is converted to netlink we can add the ability
      to configure the use of inner RSS headers for UDP tunnels.
      
      [1] Unfortunately I wasn't able to, turns out 2 of the devices
      I had access to were older generation or had old FW, and they
      did not actually support UDP tunnel port notifications (see
      the second paragraph). The third device appears to program
      the UDP ports correctly but it generates bad UDP checksums with
      or without these patches. Long story short - I'd appreciate
      reviews and testing here..
      
      v4:
       - better build fix (hopefully this one does it..)
      v3:
       - fix build issue;
       - improve bnxt changes.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4: convert to new udp_tunnel_nic infra · fb6f8970
      Jakub Kicinski committed
      Convert to new infra, make use of the ability to sleep in the callback.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt: convert to new udp_tunnel_nic infra · 442a35a5
      Jakub Kicinski committed
      Convert to new infra, taking advantage of sleeping in callbacks.
      
      v2:
       - use bp->*_fw_dst_port_id != INVALID_HW_RING_ID as indication
         that the offload is active.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ixgbe: convert to new udp_tunnel_nic infra · dc221851
      Jakub Kicinski committed
      Make use of new common udp_tunnel_nic infra. ixgbe supports
      IPv4 only, and only single VxLAN and Geneve ports (one each).
      
      v2:
       - split out the RXCSUM feature handling to separate change;
       - declare structs separately;
       - use ti.type instead of assuming table 0 is VxLAN;
       - move setting netdev->udp_tunnel_nic_info to its own switch.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ixgbe: don't clear UDP tunnel ports when RXCSUM is disabled · abc0c78c
      Jakub Kicinski committed
      It appears the clearing of UDP tunnel ports when RXCSUM
      is disabled is unnecessary. Driver will not pay attention
      to checksum bits if RXCSUM is not set, so we can let
      the hardware parse the packets.
      
      Note that the UDP tunnel port NDO handlers don't pay attention
      to the state of RXCSUM, so the ports could have been re-programmed
      anyway.
      
      This cleanup simplifies later conversion patch.
      
      v2:
       - break this out of the following patch.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: net: add a test for UDP tunnel info infra · 91f430b2
      Jakub Kicinski committed
      Add a test validating that the UDP tunnel infra works.
      
      $ ./udp_tunnel_nic.sh
      PASSED all 383 checks
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: add UDP tunnel port offload support · 424be63a
      Jakub Kicinski committed
      Add UDP tunnel port handlers to our fake driver so we can test
      the core infra.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: add tunnel info interface · c7d759eb
      Jakub Kicinski committed
      Add an interface to report offloaded UDP ports via ethtool netlink.
      
      Now that the core takes care of tracking which UDP tunnel ports the NICs
      are aware of, we can quite easily export this information out to
      user space.
      
      The responsibility of writing the netlink dumps is split between
      ethtool code and udp_tunnel_nic.c - since udp_tunnel module may
      not always be loaded, yet we should always report the capabilities
      of the NIC.
      
      $ ethtool --show-tunnels eth0
      Tunnel information for eth0:
        UDP port table 0:
          Size: 4
          Types: vxlan
          No entries
        UDP port table 1:
          Size: 4
          Types: geneve, vxlan-gpe
          Entries (1):
              port 1230, vxlan-gpe
      
      v4:
       - back to v2, build fix is now directly in udp_tunnel.h
      v3:
       - don't compile ETHTOOL_MSG_TUNNEL_INFO_GET in if CONFIG_INET
         not set.
      v2:
       - fix string set count,
       - reorder enums in the uAPI,
       - fix type of ETHTOOL_A_TUNNEL_UDP_TABLE_TYPES to bitset
         in docs and comments.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp_tunnel: add central NIC RX port offload infrastructure · cc4e3835
      Jakub Kicinski committed
      Cater to devices which:
       (a) may want to sleep in the callbacks;
       (b) only have IPv4 support;
       (c) need all the programming to happen while the netdev is up.
      
      Drivers attach UDP tunnel offload info struct to their netdevs,
      where they declare how many UDP ports of various tunnel types
      they support. Core takes care of tracking which ports to offload.
      
      Use a fixed-size array since this matches what almost all drivers
      do, and avoids the complexity and uncertainty around memory allocations
      in an atomic context.
      
      Make sure that tunnel drivers don't try to replay the ports when
      new NIC netdev is registered. Automatic replays would mess up
      reference counting, and will be removed completely once all drivers
      are converted.
      
      v4:
       - use a #define NULL to avoid build issues with CONFIG_INET=n.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
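
The fixed-size, reference-counted port table that cc4e3835 moves into the core can be sketched in user space as follows. This is a simplified illustration and not the udp_tunnel_nic implementation: the table size, the structure layout and all function names are assumptions, and it leaves out tunnel types, the sleeping/non-sleeping distinction and the replay of previously missed ports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 4 /* assumed capacity, declared up front, no allocation */

struct port_entry {
    uint16_t port;
    unsigned int use_cnt; /* 0 means the slot is free */
};

struct port_table {
    struct port_entry e[TABLE_SIZE];
};

/*
 * Add a user of @port. Returns the slot index, or -1 if the table is
 * full and the port therefore cannot be offloaded.
 */
static inline int table_add(struct port_table *t, uint16_t port)
{
    int free_slot = -1;

    assert(t != NULL);
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (t->e[i].use_cnt && t->e[i].port == port) {
            t->e[i].use_cnt++; /* already offloaded: count the new user */
            return i;
        }
        if (!t->e[i].use_cnt && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    t->e[free_slot].port = port;
    t->e[free_slot].use_cnt = 1;
    return free_slot;
}

/*
 * Drop a user of @port. Returns 1 if the slot became free, 0 if other
 * users remain, and -1 if the port was not in the table.
 */
static inline int table_del(struct port_table *t, uint16_t port)
{
    assert(t != NULL);
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (t->e[i].use_cnt && t->e[i].port == port) {
            t->e[i].use_cnt--;
            return t->e[i].use_cnt == 0;
        }
    }
    return -1;
}
```

Because the counting lives in one place, a driver using such a table would only need to program or clear a slot when table_add() claims a free slot or table_del() reports that the slot became free.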
    • udp_tunnel: re-number the offload tunnel types · 84a4160e
      Jakub Kicinski committed
      Make it possible to use tunnel types as flags more easily.
      There doesn't appear to be any user using the type as an
      array index, so this should make no difference.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
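
Numbering the tunnel types as single bits is what lets one word describe every type a port table accepts, as in the "Types: geneve, vxlan-gpe" line of the ethtool output. A minimal sketch of the idea follows; the enum and function names are hypothetical and the values are illustrative, not copied from the kernel header.

```c
#include <assert.h>

/* Hypothetical tunnel types, one bit each so they can be OR-ed together. */
enum tunnel_type {
    TUNNEL_TYPE_VXLAN     = 1u << 0,
    TUNNEL_TYPE_GENEVE    = 1u << 1,
    TUNNEL_TYPE_VXLAN_GPE = 1u << 2,
};

/* Return non-zero if a table's type mask includes the single type @t. */
static inline int table_supports(unsigned int table_types, enum tunnel_type t)
{
    /* @t must be exactly one type, i.e. a power of two. */
    assert(t != 0 && (t & (t - 1)) == 0);
    return (table_types & t) != 0;
}
```

A table declared with TUNNEL_TYPE_GENEVE | TUNNEL_TYPE_VXLAN_GPE would then accept Geneve and VXLAN-GPE ports but reject plain VXLAN. Sequential values (0, 1, 2) could not be combined this way, which is presumably why the commit checks that no user relies on the type as an array index.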
    • debugfs: make sure we can remove u32_array files cleanly · a2b992c8
      Jakub Kicinski committed
      debugfs_create_u32_array() allocates a small structure to wrap
      the data and size information about the array. If users ever
      try to remove the file this leads to a leak since nothing ever
      frees this wrapper.
      
      That said there are no upstream users of debugfs_create_u32_array()
      that'd remove a u32 array file (we only have one u32 array user in
      CMA), so there is no real bug here.
      
      Make callers pass a wrapper they allocated. This way the lifetime
      management of the wrapper is on the caller, and we can avoid the
      potential leak in debugfs.
      
      CC: Chucheng Luo <luochucheng@vivo.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 10 Jul, 2020 25 commits