1. 17 Sep, 2020 4 commits
    • mlxsw: spectrum: Track priorities in struct mlxsw_sp_hdroom · 5df825ed
      Committed by Petr Machata
      The mapping from priorities to buffers determines which buffers should be
      configured. Lossiness of these priorities combined with the mapping
      determines whether a given buffer should be lossy.
      
      Currently this configuration is stored implicitly in DCB ETS, PFC and
      ethtool PAUSE configuration. Keeping it together with the rest of the
      headroom configuration and deriving it as needed from PFC / ETS / PAUSE
      will make things clearer. To that end, add a field "prios" to struct
      mlxsw_sp_hdroom.
      
      Previously, __mlxsw_sp_port_headroom_set() took prio_tc as an argument, and
      assumed that the same mapping as we use on the egress should be used on
      ingress as well. Instead, track this configuration at each priority, so
      that it can be adjusted flexibly.
      
      In the following patches, as dcbnl_setbuffer is implemented, it will need
      to store its own mapping, and it will also be sometimes necessary to revert
      back to the original ETS mapping. Therefore track two buffer indices: the
      one for chip configuration (buf_idx), and the source one (ets_buf_idx).
      Introduce a function to configure the chip-level buffer index, and for now
      have it simply copy the ETS mapping over to the chip mapping.
      
      Update the ETS handler to project prio_tc to the ets_buf_idx and invoke the
      buf_idx recomputation.
      
      Now that there is a canonical place to look for this configuration,
      mlxsw_sp_port_headroom_set() does not need to invent def_prio_tc to use if
      DCB is compiled out.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
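
The two-index scheme described in this commit message can be sketched roughly as follows. This is a minimal stand-alone illustration, not the driver's code: only the field names prios, buf_idx and ets_buf_idx come from the commit message, while the struct names, the function names and the priority count of 8 are assumptions made for the example.

```c
#include <stdint.h>

#define HDROOM_PRIO_COUNT 8 /* assumed: eight 802.1p priorities */

/* Hypothetical per-priority state. Only buf_idx and ets_buf_idx are
 * named in the commit message; the rest is illustrative. */
struct hdroom_prio {
	uint8_t buf_idx;     /* buffer index used for chip configuration */
	uint8_t ets_buf_idx; /* source index, derived from the ETS mapping */
};

struct hdroom {
	struct hdroom_prio prios[HDROOM_PRIO_COUNT];
};

/* ETS handler side: project the prio_tc map onto ets_buf_idx. */
static void hdroom_ets_set(struct hdroom *hdroom, const uint8_t *prio_tc)
{
	for (int prio = 0; prio < HDROOM_PRIO_COUNT; prio++)
		hdroom->prios[prio].ets_buf_idx = prio_tc[prio];
}

/* Recompute the chip-level index. For now this simply copies the ETS
 * mapping over, as the commit message describes. */
static void hdroom_prios_reset_buf_idx(struct hdroom *hdroom)
{
	for (int prio = 0; prio < HDROOM_PRIO_COUNT; prio++)
		hdroom->prios[prio].buf_idx = hdroom->prios[prio].ets_buf_idx;
}
```

Keeping the source index separate means a later mapping source can overwrite buf_idx and the ETS mapping can still be restored by calling the reset function again.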
    • mlxsw: spectrum: Track MTU in struct mlxsw_sp_hdroom · 0103a3e4
      Committed by Petr Machata
      MTU influences sizes of auto-allocated buffers. Make it a part of port
      buffer configuration and have __mlxsw_sp_port_headroom_set() take it from
      there, instead of as an argument.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Unify delay handling between PFC and pause · b7e07bbd
      Committed by Petr Machata
      When a priority is marked as lossless using DCB PFC, or when pause frames
      are enabled on a port, mlxsw adds to port buffers an extra space to cover
      the traffic that will arrive between the time that a pause or PFC frame is
      emitted, and the time traffic actually stops. This is called the delay. The
      concept is the same in PFC and pause, however the way the extra buffer
      space is calculated differs.
      
      In this patch, unify this handling. Delay is to be measured in bytes of
      extra space, and will not include MTU. PFC handler sets the delay directly
      from the parameter it gets through the DCB interface.
      
      To convert the pause handler, move MLXSW_SP_PAUSE_DELAY to the ethtool
      module, convert it to bytes, reduce it by the maximum MTU, and divide by
      two. It then has the same meaning as the delay_bytes set by the PFC handler.
      
      Keep the delay_bytes value in struct mlxsw_sp_hdroom introduced in the
      previous patch. Change PFC and pause handlers to store the new delay value
      there and have __mlxsw_sp_port_headroom_set() take it from there.
      
      Instead of mlxsw_sp_pfc_delay_get() and mlxsw_sp_pg_buf_delay_get(),
      introduce mlxsw_sp_hdroom_buf_delay_get() to calculate the delay provision.
      Drop the unnecessary MLXSW_SP_CELL_FACTOR, and instead add an explanatory
      comment describing the formula used.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
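
The arithmetic implied by this commit message can be sketched as below. The relation "provision = 2 * delay + MTU" is inferred from the statement that the pause value is reduced by the maximum MTU and divided by two to match delay_bytes; it is not copied from the driver. The function names are hypothetical, the legacy pause value is assumed to be already converted to bytes, and rounding to buffer cells is ignored.

```c
#include <stdint.h>

/* Assumed model: the extra space for a lossless buffer covers the delay
 * in both directions plus one MTU-sized frame already in flight. */
static uint32_t hdroom_delay_provision(uint32_t delay_bytes, uint32_t mtu)
{
	return 2 * delay_bytes + mtu;
}

/* Convert a legacy pause provision (in bytes, MTU included) into a
 * delay_bytes value with the same meaning as the PFC one: subtract the
 * maximum MTU, then halve. Returns 0 if the input is smaller than the MTU. */
static uint32_t pause_delay_bytes(uint32_t legacy_bytes, uint32_t max_mtu)
{
	if (legacy_bytes < max_mtu)
		return 0;
	return (legacy_bytes - max_mtu) / 2;
}
```

Under this model the two helpers are inverses for even differences: converting a legacy value and feeding the result back with the same MTU gives the original provision.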
    • mlxsw: spectrum_buffers: Add struct mlxsw_sp_hdroom · 3a77f5a2
      Committed by Petr Machata
      The port headroom handling is currently strewn across several modules and
      tricky to follow: MTU, DCB PFC, DCB ETS and ethtool pause all influence the
      settings, and then there is the completely separate initial configuration in
      spectrum_buffers. A following patch will implement the dcbnl_setbuffer
      callback, which is going to further complicate the landscape.
      
      In order to simplify work with port buffers, the following patches are
      going to centralize all port-buffer handling in spectrum_buffers. As a
      first step, introduce a (currently empty) struct mlxsw_sp_hdroom that will
      keep the configuration parameters, and allocate and free it in appropriate
      places.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 16 Sep, 2020 25 commits
  3. 15 Sep, 2020 6 commits
  4. 13 Sep, 2020 1 commit
  5. 11 Sep, 2020 1 commit
    • mlx4: make sure to always set the port type · 0313c7c2
      Committed by Jakub Kicinski
      Even though mlx4_core registers the devlink ports, it's mlx4_en
      and mlx4_ib which set their type. In situations where one of
      the two is not built, yet the machine has ports of the given type,
      we see the devlink warning from devlink_port_type_warn() trigger.
      
      Having ports of a type not supported by the kernel may seem
      surprising, but it does occur in practice - when the unsupported
      port is not plugged in to a switch anyway users are more than happy
      not to see it (and potentially allocate any resources to it).
      
      Set the type in mlx4_core if the type-specific driver is not built.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
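
The fallback described in this commit message might look roughly like the sketch below. All names are hypothetical, and the two boolean parameters stand in for compile-time checks on whether the Ethernet and InfiniBand drivers are built; the real driver's structure is not shown in the commit message.

```c
#include <stdbool.h>

enum port_type { PORT_TYPE_NOTSET, PORT_TYPE_ETH, PORT_TYPE_IB };

/* Hypothetical port state: what the hardware reports versus what has
 * been announced to devlink so far. */
struct port {
	enum port_type hw_type;
	enum port_type devlink_type;
};

/* Core-driver fallback: announce the type only when the type-specific
 * driver that would normally do so is not built. */
static void core_port_type_fallback(struct port *p, bool en_built, bool ib_built)
{
	if (p->hw_type == PORT_TYPE_ETH && !en_built)
		p->devlink_type = PORT_TYPE_ETH;
	else if (p->hw_type == PORT_TYPE_IB && !ib_built)
		p->devlink_type = PORT_TYPE_IB;
}
```

When the matching type-specific driver is built, the fallback leaves devlink_type untouched so that driver remains the one to set it.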
  6. 10 Sep, 2020 3 commits
    • devlink: Introduce controller number · 3a2d9588
      Committed by Parav Pandit
      A devlink port may belong to a controller consisting of PCI devices.
      A devlink instance holds ports of two types of controllers:
      (1) controller discovered on same system where eswitch resides
      This is the case where PCI PF/VF of a controller and devlink eswitch
      instance both are located on a single system.
      (2) controller located on external host system.
      This is the case where a controller is located in one system and its
      devlink eswitch ports are located in a different system.
      
      When a devlink eswitch instance serves the devlink ports of both
      controllers together, PCI PF/VF numbers may overlap.
      Due to this a unique phys_port_name cannot be constructed.
      
      For example, in the system below, controller-0 and controller-1 each have a
      PCI PF pf0 whose eswitch ports can be present in controller-0.
      This results in a phys_port_name of "pf0" for both.
      A similar problem exists for VFs and the upcoming sub-functions.
      
      An example view of two controller systems:
      
                   ---------------------------------------------------------
                   |                                                       |
                   |           --------- ---------         ------- ------- |
      -----------  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
      | server  |  | -------   ----/---- ---/----- ------- ---/--- ---/--- |
      | pci rc  |=== | pf0 |______/________/       | pf1 |___/_______/     |
      | connect |  | -------                       -------                 |
      -----------  |     | controller_num=1 (no eswitch)                   |
                   ------|--------------------------------------------------
                   (internal wire)
                         |
                   ---------------------------------------------------------
                   | devlink eswitch ports and reps                        |
                   | ----------------------------------------------------- |
                   | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | |
                   | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
                   | ----------------------------------------------------- |
                   | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | |
                   | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
                   | ----------------------------------------------------- |
                   |                                                       |
                   |                                                       |
                   |           --------- ---------         ------- ------- |
                   |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
                   | -------   ----/---- ---/----- ------- ---/--- ---/--- |
                   | | pf0 |______/________/       | pf1 |___/_______/     |
                   | -------                       -------                 |
                   |                                                       |
                   |  local controller_num=0 (eswitch)                     |
                   ---------------------------------------------------------
      
      An example devlink port for external controller with controller
      number = 1 for a VF 1 of PF 0:
      
      $ devlink port show pci/0000:06:00.0/2
      pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
        function:
          hw_addr 00:00:00:00:00:00
      
      $ devlink port show pci/0000:06:00.0/2 -jp
      {
          "port": {
              "pci/0000:06:00.0/2": {
                  "type": "eth",
                  "netdev": "ens2f0pf0vf1",
                  "flavour": "pcivf",
                  "controller": 1,
                  "pfnum": 0,
                  "vfnum": 1,
                  "external": true,
                  "splittable": false,
                  "function": {
                      "hw_addr": "00:00:00:00:00:00"
                  }
              }
          }
      }
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
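
The naming collision described in this commit message, and how a controller number can resolve it, can be illustrated with the sketch below. The struct, the function name and the "c<controller>" prefix format are assumptions made for the example; the commit message itself only states that PF/VF numbers alone do not yield a unique phys_port_name.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical attributes of a PCI VF port. */
struct port_attrs {
	unsigned int controller;
	unsigned int pf;
	unsigned int vf;
	bool external; /* true if the port belongs to an external controller */
};

/* Build a port name; external ports get a controller prefix so that
 * identical PF/VF numbers on different controllers stay distinct.
 * Returns 0 on success, -1 if the buffer is too small. */
static int port_name_build(const struct port_attrs *a, char *buf, size_t len)
{
	int n = 0;
	int m;

	if (len == 0)
		return -1;
	if (a->external) {
		n = snprintf(buf, len, "c%u", a->controller);
		if (n < 0 || (size_t)n >= len)
			return -1;
	}
	m = snprintf(buf + n, len - (size_t)n, "pf%uvf%u", a->pf, a->vf);
	if (m < 0 || (size_t)m >= len - (size_t)n)
		return -1;
	return 0;
}
```

With this scheme, VF 1 of PF 0 on the local controller and on external controller 1 produce two different strings, where without the prefix both would be named the same.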
    • devlink: Introduce external controller flag · 05b595e9
      Committed by Parav Pandit
      A devlink eswitch port may represent PCI PF/VF ports of a controller.
      
      A controller is either located on the same system, or it can be an external
      controller located in the host where such a NIC is plugged in.
      
      Add the ability for a driver to specify whether a port is for an external
      controller.
      
      Use such flag in the mlx5_core driver.
      
      An example of an external controller, with VF1 of PF0 belonging to
      controller 1:
      
      $ devlink port show pci/0000:06:00.0/2
      pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf pfnum 0 vfnum 1 external true splittable false
        function:
          hw_addr 00:00:00:00:00:00
      $ devlink port show pci/0000:06:00.0/2 -jp
      {
          "port": {
              "pci/0000:06:00.0/2": {
                  "type": "eth",
                  "netdev": "ens2f0pf0vf1",
                  "flavour": "pcivf",
                  "pfnum": 0,
                  "vfnum": 1,
                  "external": true,
                  "splittable": false,
                  "function": {
                      "hw_addr": "00:00:00:00:00:00"
                  }
              }
          }
      }
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: E-switch, Read controller number from device · a53cf949
      Committed by Parav Pandit
      ECPF supports one external host controller. Read controller number
      from the device.
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>