1. 18 Oct, 2018 21 commits
    • octeontx2-af: CGX Rx/Tx enable/disable mbox handlers · 1435f66a
      Sunil Goutham committed
      Added new mailbox msgs for RVU PF/VFs to request AF
      to enable/disable their mapped CGX::LMAC Rx & Tx.
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: Linu Cherian <lcherian@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Improve register polling loop · 6ca3ee2f
      Sunil Goutham committed
      Instead of looping on an integer timeout, use time_before(jiffies),
      so that maximum poll time is capped.
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
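The difference between a counted loop and a deadline-based loop can be sketched in user space. The Python below is only an analogy, not the driver's code: `time.monotonic()` stands in for `jiffies`, and `poll_register`, `read_reg`, `mask`, and `pause` are hypothetical names chosen for illustration.

```python
import time


def poll_register(read_reg, mask, timeout_s, clock=time.monotonic,
                  pause=lambda: None):
    """Poll read_reg() until (value & mask) is non-zero or the deadline passes.

    The loop is bounded by wall-clock time, not by an iteration count, so the
    maximum poll time does not depend on how fast each iteration runs.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if read_reg() & mask:
            return True
        pause()  # e.g. a short sleep; a no-op by default
    # One final read so a bit that was set right at the deadline is not missed.
    return bool(read_reg() & mask)
```

Passing the clock as a parameter is a design choice made here so that the timeout path can be exercised with a fake clock instead of real waiting.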
    • Merge branch 'mlxsw-Add-VxLAN-support' · 53e50a6e
      David S. Miller committed
      Ido Schimmel says:
      
      ====================
      mlxsw: Add VxLAN support
      
      This patchset adds support for VxLAN offload in the mlxsw driver.
      
      With regard to the forwarding plane, VxLAN support is composed of two
      main parts: encapsulation and decapsulation.
      
      In the device, NVE encapsulation (and VxLAN in particular) takes place
      in the bridge. A packet can be encapsulated using VxLAN either because
      it hit an FDB entry that forwards it to the router with the IP of the
      remote VTEP or because it was flooded, in which case it is sent to a
      list of remote VTEPs (in addition to local ports). In either case, the
      VNI is derived from the filtering identifier (FID) the packet was
      classified to at ingress and the underlay source IP is taken from a
      device global configuration.
      
      VxLAN decapsulation takes place in the underlay router, where packets
      that hit a local route that corresponds to the source IP of the local
      VTEP are decapsulated and injected to the bridge. The packets are
      classified to a FID based on the VNI they came with.
      
      The first six patches export the required APIs in the VxLAN and mlxsw
      drivers in order to allow for the introduction of the NVE core in the
      next two patches. The NVE core is designed to support a variety of NVE
      encapsulations (e.g., VxLAN, NVGRE) and different ASICs, but currently
      only VxLAN and Spectrum are supported. Spectrum-2 support will be added
      in the future.
      
      The last 10 patches add support for VxLAN decapsulation and
      encapsulation and include the addition of the required switchdev APIs in
      the VxLAN driver. These APIs allow capable drivers to get a notification
      about the addition / deletion of FDB entries to / from the VxLAN's FDB.
      
      Subsequent patchsets will add selftests (generic and mlxsw-specific),
      data plane learning, FDB extack and vetoing and support for VLAN-aware
      bridges (one VNI per VxLAN device model).
      
      v2:
      * Implement netif_is_vxlan() using rtnl_link_ops->kind (Jakub & Stephen)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_switchdev: Add support for VxLAN encapsulation · 1231e04f
      Ido Schimmel committed
      In the device, VxLAN encapsulation takes place in the FDB table where
      certain {MAC, FID} entries are programmed with an underlay unicast IP.
      MAC addresses that are not programmed in the FDB are flooded to the
      relevant local ports and also to a list of underlay unicast IPs that are
      programmed using the all zeros MAC address in the VxLAN driver.
      
      One difference between the hardware and software data paths is the fact
      that in the software data path there are two FDB lookups prior to the
      encapsulation of the packet. First in the bridge's FDB table using {MAC,
      VID} and another in the VxLAN's FDB table using {MAC, VNI}.
      
      Therefore, when a new VxLAN FDB entry is notified, it is only programmed
      to the device if there is a corresponding entry in the bridge's FDB
      table. Similarly, when a new bridge FDB entry pointing to the VxLAN
      device is notified, it is only programmed to the device if there is a
      corresponding entry in the VxLAN's FDB table.
      
      Note that the above scheme will result in a discrepancy between both
      data paths if only one FDB table is populated in the software data path.
      For example, if only the bridge's FDB is populated with an entry
      pointing to a VxLAN device, then a packet hitting the entry will only be
      flooded by the kernel to remote VTEPs whereas the device will also flood
      the packets to other local ports that are members of the VLAN.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
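The rule described in this commit message (program an entry to the device only when both software FDB tables hold a matching entry) can be modelled in a few lines. This is a minimal sketch under simplifying assumptions: a single bridge and a single VxLAN device, so entries are keyed by MAC alone and not by {MAC, VID} or {MAC, VNI}. The class and method names are hypothetical and do not come from mlxsw.

```python
from dataclasses import dataclass, field


@dataclass
class EncapOffload:
    # MACs whose bridge FDB entry points at the VxLAN device.
    bridge_fdb: set = field(default_factory=set)
    # MAC -> underlay IP of the remote VTEP, as held in the VxLAN FDB.
    vxlan_fdb: dict = field(default_factory=dict)
    # What is actually programmed to the (single) hardware FDB table.
    device_fdb: dict = field(default_factory=dict)

    def _sync(self, mac):
        # Offload only if BOTH software tables have an entry for this MAC.
        if mac in self.bridge_fdb and mac in self.vxlan_fdb:
            self.device_fdb[mac] = self.vxlan_fdb[mac]
        else:
            self.device_fdb.pop(mac, None)

    def bridge_fdb_add(self, mac):
        self.bridge_fdb.add(mac)
        self._sync(mac)

    def bridge_fdb_del(self, mac):
        self.bridge_fdb.discard(mac)
        self._sync(mac)

    def vxlan_fdb_add(self, mac, remote_ip):
        self.vxlan_fdb[mac] = remote_ip
        self._sync(mac)

    def vxlan_fdb_del(self, mac):
        self.vxlan_fdb.pop(mac, None)
        self._sync(mac)

    def is_offloaded(self, mac):
        return mac in self.device_fdb
```

In this model, deleting the VxLAN FDB entry while the bridge entry remains removes the hardware entry, which is the situation in which an offload indication on the bridge entry would need to be cleared.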
    • mlxsw: spectrum: Enable VxLAN enslavement to bridges · 1c30d183
      Ido Schimmel committed
      Enslavement of VxLAN devices to offloaded bridges was never forbidden by
      mlxsw, but this patch makes sure the required configuration is performed
      in order to allow VxLAN encapsulation and decapsulation to take place in
      the device.
      
      The patch handles both the case where a VxLAN device is enslaved to an
      already offloaded bridge and the case where the first mlxsw port is
      enslaved to a bridge that already has a VxLAN device configured.
      
      Invalid configurations are sanitized and an error string is returned via
      extack.
      
      Since encapsulation and decapsulation do not occur when the VxLAN device
      is down, the driver makes sure to enable / disable these functionalities
      based on NETDEV_PRE_UP and NETDEV_DOWN events.
      
      Note that NETDEV_PRE_UP is used in favor of NETDEV_UP, as the former
      allows the operation to be vetoed, if necessary.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: switchdev: Allow clearing FDB entry offload indication · e9ba0fbc
      Ido Schimmel committed
      Currently, an FDB entry only ceases being offloaded when it is deleted.
      This changes with VxLAN encapsulation.
      
      Devices capable of performing VxLAN encapsulation usually have only one
      FDB table, unlike the software data path which has two - one in the
      bridge driver and another in the VxLAN driver.
      
      Therefore, bridge FDB entries pointing to a VxLAN device are only
      offloaded if there is a corresponding entry in the VxLAN FDB.
      
      Allow clearing the offload indication in case the corresponding entry
      was deleted from the VxLAN FDB.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Notify for each remote of a removed FDB entry · 045a5a99
      Petr Machata committed
      When notifications are sent about FDB activity, and an FDB entry with
      several remotes is removed, the notification is sent only for the first
      destination. That makes it impossible to distinguish between the case
      where only this first remote is removed, and the one where the FDB entry
      is removed as a whole.
      
      Therefore send one notification for each remote of a removed FDB entry.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
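Why one notification per remote matters can be seen from the point of view of a listener that mirrors the FDB. The sketch below is an illustrative model only: `RemoteMirror` and `notify_entry_removed` are hypothetical names, and the string events `"add"` and `"del"` stand in for the real notification types.

```python
from collections import defaultdict


class RemoteMirror:
    """A listener that mirrors the set of remotes of each VxLAN FDB entry."""

    def __init__(self):
        self.remotes = defaultdict(set)  # MAC -> set of remote IPs

    def handle(self, event, mac, remote_ip):
        if event == "add":
            self.remotes[mac].add(remote_ip)
        elif event == "del":
            self.remotes[mac].discard(remote_ip)
            if not self.remotes[mac]:
                # Last remote gone: the entry as a whole no longer exists.
                del self.remotes[mac]


def notify_entry_removed(mac, remote_ips, listener):
    """Send one 'del' notification for each remote of a removed entry."""
    for remote_ip in remote_ips:
        listener.handle("del", mac, remote_ip)
```

With a single notification for the first remote only, the mirror above would still hold the other remotes after the whole entry had been removed, which is exactly the ambiguity the commit message describes.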
    • vxlan: Support marking RDSTs as offloaded · 0efe1173
      Petr Machata committed
      Offloaded bridge FDB entries are marked with NTF_OFFLOADED. Implement a
      similar mechanism for VXLAN, where a given remote destination can be
      marked as offloaded.
      
      To that end, introduce a new event, SWITCHDEV_VXLAN_FDB_OFFLOADED,
      through which the marking is communicated to the vxlan driver. To
      identify which RDST should be marked as offloaded, a
      switchdev_notifier_vxlan_fdb_info is passed to the listeners. The
      "offloaded" flag in that object determines whether the offloaded mark
      should be set or cleared.
      
      When sending offloaded FDB entries over netlink, mark them with
      NTF_OFFLOADED.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Add vxlan_fdb_find_uc() for FDB querying · 1941f1d6
      Petr Machata committed
      A switchdev-capable driver that is aware of VXLAN may need to query
      VXLAN FDB. In the particular case of mlxsw, this functionality is
      limited to querying UC FDBs. Those being easier to deal with than the
      general case of RDST chain traversal, introduce an interface to query
      specifically UC FDBs: vxlan_fdb_find_uc().
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Add switchdev notifications · 9a997353
      Petr Machata committed
      When offloading VXLAN devices, drivers need to know about events in
      the VXLAN FDB. Since VXLAN models a bridge, it is natural to
      distribute the VXLAN FDB notifications using the pre-existing switchdev
      notification mechanism.
      
      To that end, introduce two new notification types:
      SWITCHDEV_VXLAN_FDB_ADD_TO_DEVICE and SWITCHDEV_VXLAN_FDB_DEL_TO_DEVICE.
      Introduce a new function, vxlan_fdb_switchdev_call_notifiers() to send
      the new notifier types, and a struct switchdev_notifier_vxlan_fdb_info
      to communicate the details of the FDB entry under consideration.
      
      Invoke the new function from vxlan_fdb_notify().
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Add netif_is_vxlan() · 5ff4ff4f
      Ido Schimmel committed
      Add the ability to determine whether a netdev is a VxLAN netdev by
      calling the above mentioned function that checks the netdev's
      rtnl_link_ops.
      
      This will allow modules to identify netdev events involving a VxLAN
      netdev and act accordingly. For example, drivers capable of VxLAN
      offload will need to configure the underlying device when a VxLAN netdev
      is being enslaved to an offloaded bridge.
      
      Convert nfp to use the newly introduced helper.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Configure matching local routes for NVE decap · 4cf178d7
      Ido Schimmel committed
      When a local route that matches the source IP of an offloaded NVE tunnel
      is notified, the driver needs to program it to perform NVE decapsulation
      instead of merely trapping packets to the CPU.
      
      This patch complements "mlxsw: spectrum_router: Enable local routes
      promotion to perform NVE decap" where existing local routes were
      promoted to perform NVE decapsulation.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_fid: Clear NVE configuration when destroying 802.1D FIDs · 498790be
      Ido Schimmel committed
      802.1D FIDs are used to represent VLAN-unaware bridges and currently
      this is the only type of FID that supports NVE configuration.
      
      Since the NVE tunnel device does not take a reference on the FID, it is
      possible for the FID to be destroyed when it still has NVE
      configuration.
      
      Therefore, when destroying the FID make sure to disable its NVE
      configuration.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_nve: Implement VxLAN operations · 36952911
      Ido Schimmel committed
      The common NVE core expects each encapsulation type to implement a
      certain set of operations that are specific to this type and the
      currently used ASIC. These operations include things such as the ability
      to determine whether a certain NVE configuration can be offloaded and
      ASIC-specific initialization for this type.
      
      Implement these operations for VxLAN on the Spectrum ASIC. Spectrum-2
      support will be added by a future patchset.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_nve: Implement common NVE core · 6e6030bd
      Ido Schimmel committed
      The Spectrum ASIC supports different types of NVE encapsulations (e.g.,
      VxLAN, NVGRE) with more types to be supported by future ASICs.
      
      Despite being different, all these encapsulations share some common
      functionality such as the enablement of NVE encapsulation on a given
      filtering identifier (FID) and the addition of remote VTEPs to the
      linked-list of VTEPs that traffic should be flooded to.
      
      Implement this common core and allow different ASICs to register
      different operations for different encapsulation types.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: Refactor INET_ECN_decapsulate() · 28e45033
      Ido Schimmel committed
      Drivers that support tunnel decapsulation (IPinIP or NVE) need to
      configure the underlying device to conform to the behavior outlined in
      RFC 6040 with respect to the ECN bits.
      
      This behavior is implemented by INET_ECN_decapsulate() which requires an
      skb to be passed where the ECN CE bit can be potentially set. Since
      these drivers do not need to mark an skb, but only configure the device
      to do so, factor out the business logic to __INET_ECN_decapsulate() and
      potentially perform the marking in INET_ECN_decapsulate().
      
      This allows drivers to invoke __INET_ECN_decapsulate() and configure the
      device.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Suggested-by: Petr Machata <petrm@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
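The split described above, a pure decision function plus a thin wrapper that applies the marking, can be illustrated outside the kernel. The Python below is a simplified model, not the kernel implementation: it follows a reduced reading of RFC 6040 (drop when CE arrives over a Not-ECT inner header, otherwise propagate CE) and leaves out the ECT(1) corner cases. The function names and the `OK`/`LOG`/`DROP` verdicts are assumptions made for this sketch.

```python
# ECN codepoints in the two low-order bits of the TOS / traffic class field.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11
ECN_MASK = 0b11

OK, LOG, DROP = 0, 1, 2  # hypothetical verdict codes


def ecn_decap_decision(outer_tos, inner_tos):
    """Pure decision logic: returns (verdict, set_ce) without touching a packet.

    A driver could use the result to configure hardware; a software path
    could use it to decide whether to mark the inner header.
    """
    outer = outer_tos & ECN_MASK
    inner = inner_tos & ECN_MASK
    if inner == NOT_ECT:
        if outer == NOT_ECT:
            return OK, False
        if outer in (ECT_0, ECT_1):
            return LOG, False  # unexpected combination, but forwardable
        return DROP, False     # CE cannot be conveyed to a Not-ECT inner header
    return OK, outer == CE


def ecn_decapsulate(outer_tos, inner_tos):
    """Wrapper that applies the decision to the inner TOS value."""
    verdict, set_ce = ecn_decap_decision(outer_tos, inner_tos)
    if set_ce:
        inner_tos |= CE  # setting both ECN bits yields the CE codepoint
    return verdict, inner_tos
```

Keeping the decision free of side effects is what lets a caller that has no packet at hand, such as a driver programming a device table, reuse the same logic.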
    • vxlan: Export address checking functions · cca45e05
      Ido Schimmel committed
      Drivers that support VxLAN offload need to be able to sanitize the
      configuration of the VxLAN device and accept / reject its offload.
      
      For example, mlxsw requires that the local IP of the VxLAN device be set
      and that packets be flooded to unicast IP(s) and not to a multicast
      group.
      
      Expose the functions that perform such checks.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Allow querying VR ID based on table ID · 88782f75
      Ido Schimmel committed
      In the device, different VRFs (routing tables) are represented using
      different virtual routers (VRs) and thus the kernel's table IDs are
      mapped to VR IDs.
      
      Allow internal users of the IP router to query the VR ID based on a
      kernel table ID.
      
      This is needed - for example - when configuring the underlay VR where
      VxLAN encapsulated packets will undergo an L3 lookup. In this case, the
      kernel's table ID is derived from the VxLAN device's configuration.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Enable local routes promotion to perform NVE decap · 0c69e0fc
      Ido Schimmel committed
      When an NVE tunnel with an IP underlay (e.g., VxLAN) is configured the
      local route to the tunnel's source IP needs to be promoted to perform
      NVE decapsulation.
      
      Expose an API in the unicast IP router to promote / demote local routes.
      
      The case where a local route is configured after the creation of the NVE
      tunnel will be handled in a subsequent patch in the set.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_fid: Add APIs to lookup FID without creating it · 564c6d72
      Ido Schimmel committed
      Current APIs only allow looking for a FID and creating it in case it
      does not exist.
      
      With VxLAN, in case the bridge to which the VxLAN device was enslaved
      does not already have a corresponding FID, then it means that something
      went wrong that we need to be aware of.
      
      Add an API to look up a FID without creating it, in order to catch
      the above-mentioned situation.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_fid: Allow setting and clearing NVE properties on FID · d3d19d4b
      Ido Schimmel committed
      In the device, the VNI and the list of remote VTEPs a packet should be
      flooded to are properties of the filtering identifier (FID).
      
      During encapsulation, the VNI is taken from the FID the packet was
      classified to. During decapsulation, the overlay packet is injected into
      a bridge and classified to a FID based on the VNI it came with.
      
      Allow NVE configuration for a FID. Currently, this is only supported
      with 802.1D FIDs which are used for VLAN-unaware bridges. However, NVE
      configuration is going to be supported with 802.1Q FIDs which is why the
      related fields are placed in the common FID struct.
      
      Since the device requires a 1:1 mapping between FID and VNI, the driver
      maintains a hashtable keyed by VNI and checks if the VNI is already
      associated with an existing FID.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
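The 1:1 FID to VNI constraint and the VNI-keyed lookup used at decapsulation can be sketched with two dictionaries. This is an illustrative model and not the driver's data structure: `FidNveTable` and its method names are hypothetical, and FIDs and VNIs are plain integers here.

```python
class FidNveTable:
    """Keeps a 1:1 mapping between FIDs and VNIs."""

    def __init__(self):
        self._fid_by_vni = {}  # hashtable keyed by VNI, as in the description
        self._vni_by_fid = {}

    def vni_set(self, fid, vni):
        if fid in self._vni_by_fid:
            raise ValueError(f"FID {fid} already has a VNI")
        if vni in self._fid_by_vni:
            raise ValueError(
                f"VNI {vni} is already used by FID {self._fid_by_vni[vni]}")
        self._fid_by_vni[vni] = fid
        self._vni_by_fid[fid] = vni

    def vni_clear(self, fid):
        # Safe to call for a FID without NVE configuration, e.g. on destroy.
        vni = self._vni_by_fid.pop(fid, None)
        if vni is not None:
            del self._fid_by_vni[vni]

    def fid_lookup_by_vni(self, vni):
        """Decap direction: classify a packet to a FID from its VNI."""
        return self._fid_by_vni.get(vni)
```

Making `vni_clear` idempotent mirrors the idea of clearing NVE configuration whenever a FID is destroyed, whether or not it was ever configured.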
  2. 17 Oct, 2018 14 commits
  3. 16 Oct, 2018 5 commits
    • Merge branch 'net-Kernel-side-filtering-for-route-dumps' · 2c59f06c
      David S. Miller committed
      David Ahern says:
      
      ====================
      net: Kernel side filtering for route dumps
      
      Implement kernel side filtering of route dumps by protocol (e.g., which
      routing daemon installed the route), route type (e.g., unicast), table
      id and nexthop device.
      
      iproute2 has been doing this filtering in userspace for years; pushing
      the filters to the kernel side reduces the amount of data the kernel
      sends and reduces wasted cycles on both sides processing unwanted data.
      These initial options provide a huge improvement for efficiently
      examining routes on large scale systems.
      
      v2
      - better handling of requests for a specific table. Rather than walking
        the hash of all tables, lookup the specific table and dump it
      - refactor mr_rtm_dumproute moving the loop over the table into a
        helper that can be invoked directly
      - add hook to return NLM_F_DUMP_FILTERED in DONE message to ensure
        it is returned even when the dump returns nothing
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv4: Bail early if user only wants prefix entries · e4e92fb1
      David Ahern committed
      Unlike IPv6, IPv4 does not have routes marked with RTF_PREFIX_RT. If the
      flag is set in the dump request, just return.
      
      In the process of this change, move the CLONE check to use the new
      filter flags.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv6: Bail early if user only wants cloned entries · 08e814c9
      David Ahern committed
      Similar to IPv4, the IPv6 FIB no longer contains cloned routes. If a user
      requests a route dump for only cloned entries, there is no sense in walking
      the FIB and returning everything.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mpls: Handle kernel side filtering of route dumps · 196cfebf
      David Ahern committed
      Update the dump request parsing in MPLS for the non-INET case to
      enable kernel side filtering. If INET is disabled the only filters
      that make sense for MPLS are protocol and nexthop device.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Enable kernel side filtering of route dumps · effe6792
      David Ahern committed
      Update parsing of route dump request to enable kernel side filtering.
      Allow filtering results by protocol (e.g., which routing daemon installed
      the route), route type (e.g., unicast), table id and nexthop device. These
      amount to the low hanging fruit, yet a huge improvement, for dumping
      routes.
      
      ip_valid_fib_dump_req is called with RTNL held, so __dev_get_by_index can
      be used to look up the device index without taking a reference. From
      there filter->dev is only used during dump loops with the lock still held.
      
      Set NLM_F_DUMP_FILTERED in the answer_flags so the user knows the results
      have been filtered should no entries be returned.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
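The filtering behaviour described in this series can be modelled as follows. This is a sketch under stated assumptions and not the kernel's code: `Route`, `DumpFilter`, and `dump_routes` are hypothetical names, protocols and route types are plain strings, and only the `NLM_F_DUMP_FILTERED` flag value (0x20) is taken from the netlink UAPI.

```python
from dataclasses import dataclass
from typing import Optional

NLM_F_DUMP_FILTERED = 0x20  # netlink flag: the dump was filtered as requested


@dataclass(frozen=True)
class Route:
    prefix: str
    protocol: str   # e.g. which routing daemon installed the route
    rt_type: str    # e.g. "unicast", "local"
    table_id: int
    dev: str        # nexthop device


@dataclass
class DumpFilter:
    protocol: Optional[str] = None
    rt_type: Optional[str] = None
    table_id: Optional[int] = None
    dev: Optional[str] = None

    def is_set(self):
        return any(v is not None for v in
                   (self.protocol, self.rt_type, self.table_id, self.dev))

    def matches(self, route):
        return ((self.protocol is None or route.protocol == self.protocol) and
                (self.rt_type is None or route.rt_type == self.rt_type) and
                (self.table_id is None or route.table_id == self.table_id) and
                (self.dev is None or route.dev == self.dev))


def dump_routes(tables, flt):
    """tables maps table id -> list of Route. Returns (routes, answer_flags)."""
    # The flag is reported even when nothing matches, so the caller can tell
    # an empty filtered result from an unfiltered dump of an empty FIB.
    flags = NLM_F_DUMP_FILTERED if flt.is_set() else 0
    if flt.table_id is not None:
        # Look up the requested table directly and skip walking all tables.
        selected = [tables.get(flt.table_id, [])]
    else:
        selected = list(tables.values())
    routes = [r for table in selected for r in table if flt.matches(r)]
    return routes, flags
```

For example, `dump_routes(tables, DumpFilter(protocol="bgp"))` returns only the routes installed by that protocol together with the filtered flag, so the requester does not have to discard unwanted entries itself.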