1. 15 Apr, 2016 2 commits
  2. 09 Apr, 2016 3 commits
  3. 07 Apr, 2016 8 commits
    • mlxsw: spectrum: Add IEEE 802.1Qbb PFC support · d81a6bdb
      Committed by Ido Schimmel
      Implement the appropriate DCB ops and allow a user to configure certain
      traffic classes as lossless.
      
      The operation configures PFC for both the egress (respecting PFC frames)
      and ingress (sending PFC frames) parts of the port.
      
      At egress, when a PFC frame is received for a PFC enabled priority, then
      all the priorities mapped to the same TC are stopped.
      
      At ingress, the priority group (PG) buffers to which the enabled PFC
      priorities are mapped are configured to be lossless. PFC frames will be
      transmitted when the Xoff threshold is crossed.
      
      The user-supplied delay parameter is used to determine the PG's size
      according to the following formula:
      
      PG_SIZE = PG_SIZE_LOSSY + delay * CELL_FACTOR + MTU
      
      In the worst case scenario the delay will be made up of packets that
      are all of size CELL_SIZE + 1, which means each packet will require
      almost twice its true size when buffered in the switch. We therefore
      multiply this value by the "cell factor", which is close to 2.
      
      Another MTU is added in case the transmitting host already started
      transmitting a maximum length frame when the PFC packet was received.
      
      As with PAUSE enabled ports, when the port's MTU is changed both the
      PGs' size and threshold are adjusted accordingly.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
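
The PG sizing formula in the PFC commit above can be checked with a small worked example. This is only a sketch: the cell size of 96 bytes, the use of bytes as the common unit, and all function names are assumptions made for illustration and are not taken from the driver.

```python
import math

CELL_SIZE = 96  # bytes; hypothetical value chosen for illustration


def buffered_size(packet_size: int, cell_size: int = CELL_SIZE) -> int:
    """Bytes a packet occupies when it is stored in whole cells."""
    return math.ceil(packet_size / cell_size) * cell_size


def worst_case_overhead(cell_size: int = CELL_SIZE) -> float:
    """Buffered size divided by true size for a packet of CELL_SIZE + 1 bytes."""
    packet_size = cell_size + 1
    return buffered_size(packet_size, cell_size) / packet_size


def pfc_pg_size(pg_size_lossy: int, delay: int, cell_factor: float, mtu: int) -> int:
    """PG_SIZE = PG_SIZE_LOSSY + delay * CELL_FACTOR + MTU (all in one unit)."""
    return math.ceil(pg_size_lossy + delay * cell_factor + mtu)


# Hypothetical inputs: lossy size of 2 * MTU, a delay worth 10000 bytes,
# and a cell factor rounded up to 2.
example_size = pfc_pg_size(pg_size_lossy=3000, delay=10000, cell_factor=2, mtu=1500)
```

With these assumed numbers a packet of `CELL_SIZE + 1` bytes needs two full cells, so the overhead ratio is just under 2, which is why a factor close to 2 is applied to the delay.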
    • mlxsw: reg: Introduce per priority counters · 34dba0a5
      Committed by Ido Schimmel
      We are going to add support for PFC as part of DCB ops, which requires us
      to report the number of PFC frames sent and received per priority.
      
      Add per priority counters in order to report number of PFC frames sent
      and received per priority.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Add support for PAUSE frames · 9f7ec052
      Committed by Ido Schimmel
      When a packet ingresses the switch it's placed in its assigned priority
      group (PG) buffer in the port's headroom buffer while it goes through
      the switch's pipeline. After going through the pipeline - which
      determines its egress port(s) and traffic class - it's moved to the
      switch's shared buffer awaiting transmission.
      
      However, some packets are not eligible to enter the shared buffer due to
      exceeded quotas or insufficient space. Marking their associated PGs as
      lossless will cause the packets to accumulate in the PG buffer. Another
      reason for packet accumulation is a complicated pipeline (e.g. one
      involving a lot of ACLs).
      
      To prevent packets from being dropped, a user can enable PAUSE frames on
      the port. This will mark all the active PGs as lossless and set their
      size according to the maximum delay, as it's not configured by the user.
      
                               +----------------+   +
                               |                |   |
                               |                |   |
                               |                |   |
                               |                |   |
                               |                |   |
                               |                |   | Delay
                               |                |   |
                               |                |   |
                               |                |   |
                               |                |   |
                               |                |   |
          Xon/Xoff threshold   +----------------+   +
                               |                |   |
                               |                |   | 2 * MTU
                               |                |   |
                               +----------------+   +
      
      The delay (612 [Cells]) was calculated according to a worst-case scenario
      involving maximum MTU and 100m cables.
      
      After marking the PGs as lossless the device is configured to respect
      incoming PAUSE frames (Rx PAUSE) and generate PAUSE frames (Tx PAUSE)
      according to the user's settings.
      
      Whenever the port's headroom configuration changes we take into account
      the PAUSE configuration, so that we correctly set the PG's type (lossy /
      lossless), size and threshold. This can happen when:
      
      a) The port's MTU changes, as it directly affects the PG's size.
      
      b) A PG is created following user configuration, by binding a priority
      to it.
      
      Note that the relevant SUPPORTED flags were already mistakenly set by
      the driver before this commit.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
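
The buffer layout drawn in the PAUSE commit above can be sketched as a small calculation. The 612-cell delay comes from the commit message; the 96-byte cell size and the function names are assumptions for illustration, and placing the Xon/Xoff threshold at 2 * MTU is a reading of the figure rather than a statement about the driver code.

```python
import math

MAX_DELAY_CELLS = 612  # worst-case delay, from the commit message
CELL_SIZE = 96         # bytes; hypothetical value chosen for illustration


def bytes_to_cells(size_bytes: int, cell_size: int = CELL_SIZE) -> int:
    """Round a byte count up to a whole number of cells."""
    return math.ceil(size_bytes / cell_size)


def pause_pg_layout(mtu: int) -> dict:
    """Size and threshold (in cells) of a lossless PG on a PAUSE-enabled port."""
    threshold = bytes_to_cells(2 * mtu)  # Xon/Xoff threshold sits above 2 * MTU
    return {
        "lossless": True,
        "threshold": threshold,
        "size": threshold + MAX_DELAY_CELLS,  # delay region on top of the threshold
    }
```

Because both values depend on the MTU, such a function would have to be re-run whenever the port's MTU changes, which matches case (a) in the commit message.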
    • mlxsw: spectrum: Allow setting maximum rate for a TC · cc7cf517
      Committed by Ido Schimmel
      Allow a user to set maximum rate for a particular TC using DCB ops.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Add IEEE 802.1Qaz ETS support · 8e8dfe9f
      Committed by Ido Schimmel
      Implement the appropriate DCB ops and allow a user to configure:
      	* Priority to traffic class (TC) mapping with a total of 8
      	  supported TCs
      	* Transmission selection algorithm (TSA) for each TC and the
      	  corresponding weights in case of weighted round robin (WRR)
      
      As previously explained, we treat the priority group (PG) buffer in the
      port's headroom as the ingress counterpart of the egress TC. Therefore,
      when a certain priority to TC mapping is configured, we also configure
      the port's headroom buffer.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
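
The two user-configurable items in the ETS commit above (priority to TC mapping, and per-TC TSA with WRR weights) can be modelled as a small validated structure. This is a sketch only: the class, field names, and the `"strict"` / `"wrr"` labels are hypothetical, and the rule that WRR weights must total 100 is an assumption based on how ETS bandwidth shares are usually expressed as percentages.

```python
from dataclasses import dataclass
from typing import List

NUM_TCS = 8         # total supported TCs, per the commit message
NUM_PRIORITIES = 8  # assumption: priorities 0..7


@dataclass
class EtsConfig:
    prio_tc: List[int]    # index = priority, value = traffic class
    tc_tsa: List[str]     # index = TC, value = "strict" or "wrr"
    tc_weight: List[int]  # index = TC, weight used only when TSA is "wrr"


def validate_ets(cfg: EtsConfig) -> List[str]:
    """Return a list of problems found in the configuration (empty if valid)."""
    errors = []
    if len(cfg.prio_tc) != NUM_PRIORITIES:
        errors.append("prio_tc must have one entry per priority")
    if any(not 0 <= tc < NUM_TCS for tc in cfg.prio_tc):
        errors.append("priority mapped to a TC outside 0..7")
    if len(cfg.tc_tsa) != NUM_TCS or len(cfg.tc_weight) != NUM_TCS:
        errors.append("tc_tsa and tc_weight must have one entry per TC")
        return errors
    if any(tsa not in ("strict", "wrr") for tsa in cfg.tc_tsa):
        errors.append("unknown TSA")
    wrr_total = sum(w for tsa, w in zip(cfg.tc_tsa, cfg.tc_weight) if tsa == "wrr")
    if "wrr" in cfg.tc_tsa and wrr_total != 100:
        errors.append("WRR weights must sum to 100")
    return errors
```

For example, `EtsConfig([0] * 8, ["strict"] * 8, [0] * 8)` passes validation, while a configuration with two WRR classes weighted 60 and 30 is reported as invalid under the assumed sum rule.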
    • mlxsw: spectrum: Introduce support for Data Center Bridging (DCB) · f00817df
      Committed by Ido Schimmel
      Introduce basic infrastructure for DCB and add the missing ops in
      the following patches.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Initialize egress scheduling · 90183b98
      Committed by Ido Schimmel
      Before introducing support for DCB ops we should first make sure we
      initialize the relevant parts in the device correctly. Specifically, the
      egress scheduling.
      
      The device supports a superset of the 802.1Qaz standard with 4 hierarchy
      levels that can be linked to each other in multiple ways and with
      different transmission selection algorithms (TSA) employed between them.
      
      However, since we only intend to support the 802.1Qaz standard we
      flatten the hierarchies and let the user configure via DCB ops the TSA
      and max rate shaper at the subgroup hierarchy (see figure below) and the
      mapping between switch priority to traffic class. By default, all switch
      priorities are mapped to traffic class 0, strict priority is employed
      and max shaper is disabled.
      
      Default configuration:
      
               switch priority 0      ...         switch priority 7
                       +                                  +
                       |                                  |
                       +----------------------------------+
                       |
                    +--v--+                          +-----+
      Traffic Class |     |                          |     |
        Hierarchy   | TC0 |           ...            | TC7 |
                    |     |                          |     |
                    +--+--+                          +--+--+
                       |                                |
                    +--v--+                          +--v--+
        Subgroup    | SG0 |                          | SG7 |
        Hierarchy   |     |                          |     |
                    +-----+                          +-----+
                    | TSA |                          | TSA |
                    +-----+           ...            +-----+
                    | MAX |                          | MAX |
                    +--+--+                          +--+--+
                       |                                |
                       +---------------+----------------+
                                       |
                                    +--v--+
                            Group   |     |
                          Hierarchy | GR0 |
                                    |     |
                                    +--+--+
                                       |
                                    +--v--+
                            Port    |     |
                          Hierarchy | PR0 |
                                    |     |
                                    +-----+
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
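
The flattened hierarchy in the figure above can be written down as a parent map, together with the stated defaults. This is an illustrative model only; the tuple keys, dictionary layout, and function name are hypothetical and do not mirror any register or driver structure.

```python
NUM_TCS = 8  # TC0..TC7 and SG0..SG7, as drawn in the figure


def default_egress_hierarchy(num_tcs: int = NUM_TCS) -> dict:
    """Map each scheduling element to its parent, following the figure."""
    links = {("port", 0): None, ("group", 0): ("port", 0)}
    for i in range(num_tcs):
        links[("subgroup", i)] = ("group", 0)  # every SG feeds the single group
        links[("tc", i)] = ("subgroup", i)     # TCi is linked 1:1 to SGi
    return links


# Defaults named in the commit message.
DEFAULT_PRIO_TC = [0] * 8                                     # all priorities -> TC0
DEFAULT_TSA = {i: "strict" for i in range(NUM_TCS)}           # strict priority
DEFAULT_MAX_SHAPER = {i: None for i in range(NUM_TCS)}        # None = shaper disabled
```

Flattening keeps a single path from each TC to the port, so the only per-TC knobs left to the user are the TSA and the max rate at the subgroup level.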
    • mlxsw: spectrum: Correctly configure headroom size · ff6551ec
      Committed by Ido Schimmel
      When packets ingress the switch they are assigned a switch priority and
      directed to the corresponding priority group (PG) buffer in the port's
      headroom buffer.
      
      Since we now map all switch priorities to priority group 0 (PG0) by
      default, there is no need to allocate the other priority groups during
      initialization. The only exception is PG9, which is used for control
      traffic.
      
      At minimum, the PG should be able to store the currently classified
      packet (pipeline latency isn't 0) and also the packets arriving during
      the classification time. However, an incoming packet will not be
      buffered if there is no available MTU-sized buffer space for storing it.
      
      The buffer needed to accommodate pipeline latency is variable and
      needs to take into account both the current link speed and current
      latency of the pipeline, which is time-dependent. Testing showed that
      setting the PG's size to twice the current MTU is optimal.
      
      Since PG9 is used strictly for control packets and not subject to flow
      control, we are not going to resize it according to user configuration,
      so we simply set it according to the worst-case scenario, which is twice
      the maximum MTU.
      
      In any case, later patches in the series will allow a user to direct
      lossless flows to PGs other than PG0 and set their size to accommodate
      round-trip propagation delay.
      
      The above change also requires us to resize the PG buffer whenever the
      port's MTU is changed.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
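
The headroom sizing rule from the commit above (PG0 at twice the current MTU, PG9 at twice the maximum MTU) can be sketched as follows. The maximum MTU value and the function name are assumptions for illustration; sizes are in bytes for simplicity, whereas the hardware may use other units.

```python
CONTROL_PG = 9   # PG9 carries control traffic, per the commit message
MAX_MTU = 10000  # bytes; hypothetical maximum MTU chosen for illustration


def headroom_sizes(mtu: int, max_mtu: int = MAX_MTU) -> dict:
    """Return the buffer size for each PG allocated at initialization."""
    if not 0 < mtu <= max_mtu:
        raise ValueError("MTU out of range")
    return {
        0: 2 * mtu,               # PG0 follows the current MTU
        CONTROL_PG: 2 * max_mtu,  # PG9 is fixed at the worst case
    }


# An MTU change only affects PG0; PG9 stays the same.
before = headroom_sizes(1500)
after = headroom_sizes(9000)
```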
  4. 06 Apr, 2016 1 commit
  5. 08 Mar, 2016 1 commit
  6. 02 Mar, 2016 5 commits
  7. 18 Feb, 2016 2 commits
  8. 29 Jan, 2016 2 commits
    • mlxsw: spectrum: Flush FDB when leaving bridge · 039c49a6
      Committed by Ido Schimmel
      As explained in the previous commit, we should always take care of flushing
      the FDB in the driver and not rely on bridge code.
      
      We need to distinguish between two cases with regard to LAG:
      
      1) Port is leaving LAG while LAG is bridged (or VLAN devices on top of
      it). In this case don't flush the FDB entries pointing to the LAG ID, as
      this will affect other ports that are still members of the LAG. Only
      flush the FDB when the last port in the LAG is leaving the bridge.
      
      2) LAG device is leaving the bridge. In this case the CHANGEUPPER event
      is simply propagated to each member port, so make each port flush the
      FDB in its turn.
      
      Note that emptying a bridged LAG from ports creates an inconsistency
      between hardware and software. A user who later (< ageing_time)
      re-populates the LAG won't have any FDB entries pointing to the LAG ID
      in hardware, but they will be present in the software bridge's FDB.
      Currently there is no good solution to this problem, but this will be
      addressed by us in the future.
      
      In order to optimize the flushing process, flush by port or LAG ID if
      there are no VLAN interfaces on top of the port. Otherwise, flush using
      {Port / LAG ID, FID=VID} for each of the lower 4K FIDs. In the case of
      a VLAN device, simply flush using {Port / LAG ID, vFID} with the vFID to
      which the VLAN device is mapped.
      
      Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
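
The flush strategy described in the commit above can be sketched as a function that returns the keys to flush by. All names are hypothetical, and treating "the lower 4K FIDs" as FIDs 0 through 4095 is an assumption about the exact bounds.

```python
from typing import List, Optional, Tuple

NUM_LOWER_FIDS = 4096  # "lower 4K FIDs"; exact bounds are an assumption


def fdb_flush_keys(port_or_lag_id: int, has_vlan_uppers: bool,
                   vfid: Optional[int] = None) -> List[Tuple[int, ...]]:
    """Return the keys by which FDB entries would be flushed."""
    if vfid is not None:
        # VLAN device (vPort): a single {Port / LAG ID, vFID} flush.
        return [(port_or_lag_id, vfid)]
    if not has_vlan_uppers:
        # No VLAN interfaces on top: one flush by port or LAG ID covers it.
        return [(port_or_lag_id,)]
    # Otherwise flush per {Port / LAG ID, FID=VID}.
    return [(port_or_lag_id, fid) for fid in range(NUM_LOWER_FIDS)]


def lag_needs_flush(bridged_members_left: int) -> bool:
    """Flush LAG entries only once the last member has left the bridge."""
    return bridged_members_left == 0
```

The single-key flush is the cheap path, which is why it is preferred whenever no VLAN interfaces sit on top of the port.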
    • mlxsw: spectrum: Handle port leaving LAG while bridged · 4dc236c3
      Committed by Ido Schimmel
      It is possible for a user to remove a port from a LAG device, while the
      LAG device or VLAN devices on top of it are bridged. In these cases,
      bridge's teardown sequence is never issued, so we need to take care of
      it ourselves.
      
      When LAG's unlinking event is received by port netdev:
      
      1) Traverse its vPorts list and make those that are members of a bridge leave it.
         They will be deleted later by LAG code.
      
      2) Make the port netdev itself leave its bridge if it is a member of one.
      
      Fixes: 0d65fc13 ("mlxsw: spectrum: Implement LAG port join/leave")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 11 Jan, 2016 2 commits
  10. 07 Jan, 2016 1 commit
  11. 05 Jan, 2016 4 commits
  12. 16 Dec, 2015 6 commits
    • mlxsw: spectrum: Add support for VLAN devices on top of LAG · 272c4470
      Committed by Ido Schimmel
      When creating a VLAN device on top of LAG, we are basically creating a
      vPort on top of each of the port netdevs that are members of the LAG.
      Therefore, these vPorts should inherit both the LAG status and LAG ID
      from the underlying port netdevs.
      
      In addition, when the VLAN device joins or leaves a bridge each of the
      underlying vPorts should know about it and act accordingly. This is
      achieved by propagating the VLAN event down to the lower devices.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Add support for VLAN devices bridging · 26f0e7fb
      Committed by Ido Schimmel
      All the member VLAN devices in a bridge need to share the same vFID.
      
      To achieve that, expand the vFID struct to include the associated bridge
      device (or lack thereof) and allow one to look up a vFID based on a bridge
      device.
      
      When joining a bridge, look up the relevant vFID or create one if none
      exists. Next, make the VLAN device use the vFID.
      
      Leaving a bridge can either occur because a user removed the VLAN device
      from a bridge or because the VLAN device was deleted by the user. In the
      latter case the bridge's teardown sequence is invoked after the hardware
      vPort is already gone. Therefore, when unlinking the VLAN device from
      the real device, check if the associated vPort is bridged and act
      accordingly. The bridge's notification will be ignored in this case.
      
      Note that bridging a VLAN interface with an ordinary port netdev is
      currently not supported, but not forbidden. This will be addressed in a
      follow-up patchset.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
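
The "look up the relevant vFID or create one" step in the commit above is a find-or-create keyed by bridge device. A minimal sketch follows, assuming bridges are identified by name and vFIDs are handed out sequentially from a hypothetical starting value; neither detail comes from the driver.

```python
class BridgeVfidTable:
    """Find-or-create table mapping a bridge device to its shared vFID."""

    def __init__(self, first_vfid: int = 0):
        self._by_bridge = {}          # bridge name -> vFID
        self._next_vfid = first_vfid  # hypothetical sequential allocation

    def find(self, bridge: str):
        """Return the vFID bound to the bridge, or None if there is none."""
        return self._by_bridge.get(bridge)

    def join(self, bridge: str) -> int:
        """Return the bridge's vFID, creating it on first use."""
        vfid = self.find(bridge)
        if vfid is None:
            vfid = self._next_vfid
            self._next_vfid += 1
            self._by_bridge[bridge] = vfid
        return vfid
```

Two VLAN devices with different VIDs that join the same bridge get the same vFID back from `join()`, which is the property the commit requires.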
    • mlxsw: spectrum: Handle VLAN devices linking / unlinking · 9589a7b5
      Committed by Ido Schimmel
      When a VLAN interface is configured on top of a physical port we should
      associate the VLAN device with the matching vPort. Likewise, when it's
      removed, we should revert back to the underlying port netdev.
      
      While not a must, this is consistent with port netdevs and also provides
      more accurate error printing via netdev_err() and friends.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Add another flood table for vFIDs · 19ae6124
      Committed by Ido Schimmel
      We previously used only one flood table for packets classified to vFIDs.
      However, since we are going to add support for bridges between VLAN
      interfaces (mapped to vFIDs) we need to add one more flood table.
      
      That way we can separate the flooding domain of unknown unicast traffic
      from all the rest and support flood control (as we do with the 802.1Q
      bridge).
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Split vFID range in two · 7f71eb46
      Committed by Ido Schimmel
      Up until now we used a 1:1 mapping - based on VID - to map a VLAN
      interface to a vFID. However, a different scheme is needed in order to
      support bridges between VLAN interfaces, as all the member interfaces -
      which can have different VIDs - need to share the same vFID.
      
      Solve that by splitting the vFID range in two:
       1. Non-bridged VLAN interfaces
       2. Bridged VLAN interfaces
      
      When a VLAN interface is created, assign it the next available vFID in
      the first range, unless one already exists for that VID or the number of
      vFIDs in the range was exceeded. When an interface is removed, free the
      vFID, unless other interfaces are mapped to it.
      
      To accomplish the above:
       1. Store the VID to vFID mapping in a new struct (mlxsw_sp_vfid), which
          has a global context and holds a reference count.
       2. Create a vPort (dummy in case of bridge SELF invocation) on top of
          the physical port and hold a reference to the associated vFID.
      
      	     vfid                    vfid
      	+-------------+	        +-------------+
      	| vfid        |         | vfid        |
      	| vid         +---> ... | vid         |
      	| nr_vports   |         | nr_vports   |
      	+------+------+         +------+------+
      				       |
      	       +-----------------------+-------+
      	       |			       |
      	     vport			     vport
      	+-------------+         	+-------------+
      	| ...	      |         	| ...	      |
      	| *vfid	      +---> ... 	| *vfid	      +---> ...
      	| ...	      |         	| ...	      |
      	+------+------+         	+------+------+
      	       |                               |
      	     port			     port
      	+-------------+         	+-------------+
      	| ...         |         	| ...         |
      	| vports_list |         	| vports_list |
      	| ...         |         	| ...         |
      	+-------------+         	+-------------+
      	     swXpY			     swXpZ
      
      Next patches in the series will add the missing infrastructure for the
      second range and transfer vPorts between the two ranges according to the
      received notifications.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
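
The allocation rule for the first vFID range (reuse the vFID for a known VID, otherwise take the next free one, and free it when the last user goes away) amounts to a reference-counted pool. The sketch below illustrates that idea; the class and method names are hypothetical, and only `nr_vports` echoes a field shown in the diagram.

```python
class VfidPool:
    """Reference-counted VID -> vFID mapping for non-bridged VLAN interfaces."""

    def __init__(self, size: int):
        self.size = size
        self._by_vid = {}  # vid -> [vfid, nr_vports]

    def get(self, vid: int) -> int:
        """Take a reference on the vFID for this VID, allocating if needed."""
        entry = self._by_vid.get(vid)
        if entry is None:
            used = {e[0] for e in self._by_vid.values()}
            free = next((v for v in range(self.size) if v not in used), None)
            if free is None:
                raise RuntimeError("no free vFID in range")
            entry = self._by_vid[vid] = [free, 0]
        entry[1] += 1  # one more vPort refers to this vFID
        return entry[0]

    def put(self, vid: int) -> None:
        """Drop a reference; free the vFID when no vPort uses it anymore."""
        entry = self._by_vid[vid]
        entry[1] -= 1
        if entry[1] == 0:
            del self._by_vid[vid]
```

A VLAN interface with the same VID on a second port would call `get()` again and share the existing vFID, so the vFID survives until both interfaces are removed.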
    • mlxsw: spectrum: Allocate active VLANs only for port netdevs · bd40e9d6
      Committed by Ido Schimmel
      When adding support for bridges between VLAN interfaces, we'll introduce
      a new entity called a vPort, which is a representation of the VLAN
      interface in the hardware.
      
      The main difference between a vPort and a physical port is that several
      FIDs can be bound to the latter, whereas only one (called a vFID) can be
      bound to the former.
      
      Therefore, it makes sense to use the same struct to represent the two,
      but to only allocate the 'active_vlans' bitmap in case of a physical
      port.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 12 Dec, 2015 1 commit
  14. 04 Dec, 2015 2 commits