1. 12 May, 2020 5 commits
  2. 11 May, 2020 9 commits
    • V
      net: dsa: sja1105: implement cross-chip bridging operations · ac02a451
      Committed by Vladimir Oltean
      sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart
      and which is compatible with cascading. A complete description of this
      tagging format is in net/dsa/tag_8021q.c, but a quick summary is that
      each external-facing port tags incoming frames with a unique pvid, and
      this special VLAN is transmitted as tagged towards the inside of the
      system, and as untagged towards the exterior. The tag encodes the switch
      id and the source port index.
      
      This means that cross-chip bridging for dsa_8021q only entails adding
      the dsa_8021q pvids of one switch to the RX filter of the other
      switches. Everything else falls naturally into place, as long as the
      bottom-end of ports (the leaves in the tree) consists exclusively of
      dsa_8021q-compatible switches (i.e. sja1105). Otherwise, there would be
      a chance that a front-panel switch transmits a packet tagged with a
      dsa_8021q header which it wouldn't be able to remove, and which would
      hence "leak" out.
      
      The only use case I tested (due to lack of board availability) was when
      the sja1105 switches are part of disjoint trees (however, this doesn't
      change the fact that multiple sja1105 switches still need unique switch
      identifiers in such a system). But in principle, even "true" single-tree
      setups (with DSA links) should work just as well, except for a small
      change which I can't test: dsa_towards_port should be used instead of
      dsa_upstream_port. (I made the assumption that the routing port that any
      sja1105 should use towards its neighbours is the CPU port. That might
      not hold true in other setups.)
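
      The idea of "adding the pvids of one switch to the RX filter of the
      other" can be modelled in a few lines. This is only a sketch: the bit
      layout, the Switch class and crosschip_bridge_join() below are
      hypothetical stand-ins, not the driver's code. The real tag layout is
      described in net/dsa/tag_8021q.c.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Hypothetical encoding, chosen only for illustration.
RX_VID_BASE = 0x400
PORT_BITS = 4


def rx_vid(switch_id: int, port: int) -> int:
    """Illustrative pvid that tags frames entering (switch_id, port)."""
    return RX_VID_BASE | (switch_id << PORT_BITS) | port


def decode(vid: int) -> Tuple[int, int]:
    """Recover (switch_id, source_port) from an illustrative pvid."""
    return (vid >> PORT_BITS) & 0x7, vid & ((1 << PORT_BITS) - 1)


@dataclass
class Switch:
    switch_id: int
    user_ports: List[int]
    rx_filter: Set[int] = field(default_factory=set)

    def own_pvids(self) -> Set[int]:
        return {rx_vid(self.switch_id, p) for p in self.user_ports}


def crosschip_bridge_join(local: Switch, other: Switch) -> None:
    """Let each switch accept frames already tagged by the other one."""
    local.rx_filter |= other.own_pvids()
    other.rx_filter |= local.own_pvids()


sw1 = Switch(1, [0, 1, 2, 3])
sw2 = Switch(2, [0, 1, 2, 3])
crosschip_bridge_join(sw1, sw2)
```

      Because the switch id is part of the tag, the two switches need unique
      identifiers even when they sit in disjoint trees; otherwise their
      pvids would collide in the shared RX filters.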
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ac02a451
    • V
      net: dsa: introduce a dsa_switch_find function · 3b7bc1f0
      Committed by Vladimir Oltean
      Somewhat similar to dsa_tree_find, dsa_switch_find returns a dsa_switch
      structure pointer by searching for its tree index and switch index (the
      parameters from dsa,member). To be used, for example, by drivers that
      implement .crosschip_bridge_join and need a reference to the other
      switch indicated by the tree_index and sw_index arguments.
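
      A minimal model of such a lookup, assuming a global list of trees
      that each hold their switches. The class names, TREES and
      switch_find() are hypothetical and only mirror the description
      above; they are not the kernel's data structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DsaSwitch:
    index: int   # second cell of dsa,member
    name: str


@dataclass
class DsaSwitchTree:
    index: int   # first cell of dsa,member
    switches: List[DsaSwitch] = field(default_factory=list)


# Stand-in for the system-wide list of switch trees.
TREES = [
    DsaSwitchTree(0, [DsaSwitch(0, "felix")]),
    DsaSwitchTree(1, [DsaSwitch(1, "sja1105-1")]),
    DsaSwitchTree(2, [DsaSwitch(2, "sja1105-2")]),
]


def switch_find(tree_index: int, sw_index: int) -> Optional[DsaSwitch]:
    """Return the switch matching (tree_index, sw_index), or None."""
    for tree in TREES:
        if tree.index != tree_index:
            continue
        for ds in tree.switches:
            if ds.index == sw_index:
                return ds
    return None
```

      A caller has to handle the None case, since a notification may name
      a switch that is not (or no longer) registered.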
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3b7bc1f0
    • V
      net: dsa: permit cross-chip bridging between all trees in the system · f66a6a69
      Committed by Vladimir Oltean
      One way of utilizing DSA is by cascading switches which do not all have
      compatible taggers. Consider the following real-life topology:
      
            +---------------------------------------------------------------+
            | LS1028A                                                       |
            |               +------------------------------+                |
            |               |      DSA master for Felix    |                |
            |               | (internal ENETC port 2: eno2)|                |
            |  +------------+------------------------------+-------------+  |
            |  | Felix embedded L2 switch                                |  |
            |  |                                                         |  |
            |  | +--------------+   +--------------+   +--------------+  |  |
            |  | |DSA master for|   |DSA master for|   |DSA master for|  |  |
            |  | |  SJA1105 1   |   |  SJA1105 2   |   |  SJA1105 3   |  |  |
            |  | |(Felix port 1)|   |(Felix port 2)|   |(Felix port 3)|  |  |
            +--+-+--------------+---+--------------+---+--------------+--+--+
      
      +-----------------------+ +-----------------------+ +-----------------------+
      |   SJA1105 switch 1    | |   SJA1105 switch 2    | |   SJA1105 switch 3    |
      +-----+-----+-----+-----+ +-----+-----+-----+-----+ +-----+-----+-----+-----+
      |sw1p0|sw1p1|sw1p2|sw1p3| |sw2p0|sw2p1|sw2p2|sw2p3| |sw3p0|sw3p1|sw3p2|sw3p3|
      +-----+-----+-----+-----+ +-----+-----+-----+-----+ +-----+-----+-----+-----+
      
      The above can be described in the device tree as follows (obviously not
      complete):
      
      mscc_felix {
      	dsa,member = <0 0>;
      	ports {
      		port@4 {
      			ethernet = <&enetc_port2>;
      		};
      	};
      };
      
      sja1105_switch1 {
      	dsa,member = <1 1>;
      	ports {
      		port@4 {
      			ethernet = <&mscc_felix_port1>;
      		};
      	};
      };
      
      sja1105_switch2 {
      	dsa,member = <2 2>;
      	ports {
      		port@4 {
      			ethernet = <&mscc_felix_port2>;
      		};
      	};
      };
      
      sja1105_switch3 {
      	dsa,member = <3 3>;
      	ports {
      		port@4 {
      			ethernet = <&mscc_felix_port3>;
      		};
      	};
      };
      
      Basically we instantiate one DSA switch tree for every hardware switch
      in the system, but we still give them globally unique switch IDs (will
      come back to that later). Having 3 disjoint switch trees makes the
      tagger drivers "just work", because net devices are registered for the
      3 Felix DSA master ports, and they are also DSA slave ports to the ENETC
      port. So packets received on the ENETC port are stripped of their
      stacked DSA tags one by one.
      
      Currently, hardware bridging between ports on the same sja1105 chip is
      possible, but switching between sja1105 ports on different chips is
      handled by the software bridge. This is fine, but we can do better.
      
      In fact, the dsa_8021q tag used by sja1105 is compatible with cascading.
      In other words, a sja1105 switch can correctly parse and route a packet
      containing a dsa_8021q tag. So if we could enable hardware bridging on
      the Felix DSA master ports, cross-chip bridging could be completely
      offloaded.
      
      Such a system would be used as follows:
      
      ip link add dev br0 type bridge && ip link set dev br0 up
      for port in sw0p0 sw0p1 sw0p2 sw0p3 \
      	    sw1p0 sw1p1 sw1p2 sw1p3 \
      	    sw2p0 sw2p1 sw2p2 sw2p3; do
      	ip link set dev $port master br0
      done
      
      The above makes switching between ports on the same row be performed in
      hardware, and between ports on different rows in software. Now assume
      the Felix switch ports are called swp0, swp1, swp2. By running the
      following extra commands:
      
      ip link add dev br1 type bridge && ip link set dev br1 up
      for port in swp0 swp1 swp2; do
      	ip link set dev $port master br1
      done
      
      the CPU no longer sees packets which traverse sja1105 switch boundaries
      and can be forwarded directly by Felix. The br1 bridge would not be used
      for any sort of traffic termination.
      
      For this to work, we need to give drivers an opportunity to listen for
      bridging events on DSA trees other than their own, and pass that other
      tree index as argument. I have made the assumption, for the moment, that
      the other existing DSA notifiers don't need to be broadcast to other
      trees. That assumption might turn out to be incorrect. But in the
      meantime, introduce a dsa_broadcast function, similar in purpose to
      dsa_port_notify, which is used only by the bridging notifiers.
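
      The difference between notifying one tree and broadcasting to all of
      them can be sketched as below. Tree, notify() and broadcast() are
      hypothetical names for illustration; the event payload carries the
      originating tree index so that listeners in other trees can tell
      where the bridging event came from.

```python
from typing import Callable, Dict, List

BRIDGE_JOIN = "bridge_join"


class Tree:
    """Toy switch tree with its own notifier chain."""

    def __init__(self, index: int) -> None:
        self.index = index
        self.listeners: List[Callable[[str, Dict], None]] = []

    def notify(self, event: str, info: Dict) -> None:
        for callback in self.listeners:
            callback(event, info)


def broadcast(trees: List[Tree], event: str, info: Dict) -> None:
    """Deliver the event to every tree, not only the originating one."""
    for tree in trees:
        tree.notify(event, info)


seen = []
trees = [Tree(0), Tree(1), Tree(2)]
# A driver in tree 2 listens for bridging events from any tree.
trees[2].listeners.append(
    lambda event, info: seen.append((event, info["tree_index"], info["sw_index"]))
)
# Port 0 of switch 1 in tree 1 joins br0.
broadcast(trees, BRIDGE_JOIN,
          {"tree_index": 1, "sw_index": 1, "port": 0, "bridge": "br0"})
```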
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f66a6a69
    • V
      net: bridge: allow enslaving some DSA master network devices · 9eb8eff0
      Committed by Vladimir Oltean
      Commit 8db0a2ee ("net: bridge: reject DSA-enabled master netdevices
      as bridge members") added a special check in br_if.c in order to check
      for a DSA master network device with a tagging protocol configured. This
      was done because back then, such devices, once enslaved in a bridge
      would become inoperative and would not pass DSA tagged traffic anymore
      due to br_handle_frame returning RX_HANDLER_CONSUMED.
      
      But right now we have valid use cases which do require bridging of DSA
      masters. One such example is when the DSA master ports are DSA switch
      ports themselves (in a disjoint tree setup). This should be completely
      equivalent, functionally speaking, to having multiple DSA switches
      hanging off of the ports of a switchdev driver. So we should allow the
      enslaving of DSA tagged master network devices.
      
      Instead of the regular br_handle_frame(), install a new function
      br_handle_frame_dummy() on these DSA masters, which returns
      RX_HANDLER_PASS in order to call into the DSA specific tagging protocol
      handlers, and lift the restriction from br_add_if.
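
      Why the return value matters can be shown with a toy receive path.
      This is a simplified model, not the kernel's RX path: receive() and
      the tagger callback are hypothetical, and the regular handler is
      reduced to the one behaviour described above (consuming the frame).

```python
from enum import Enum
from typing import Callable, Optional


class RxHandlerResult(Enum):
    CONSUMED = "consumed"   # frame taken by the handler, processing stops
    PASS = "pass"           # frame continues to protocol handlers


def br_handle_frame(frame: bytes) -> RxHandlerResult:
    # Simplified: models only the case that made DSA masters inoperative.
    return RxHandlerResult.CONSUMED


def br_handle_frame_dummy(frame: bytes) -> RxHandlerResult:
    # Never touches the frame, so the DSA tagger still gets to see it.
    return RxHandlerResult.PASS


def receive(frame: bytes,
            rx_handler: Callable[[bytes], RxHandlerResult],
            tagger: Callable[[bytes], str]) -> Optional[str]:
    """Run the rx handler, then the tagger only if the frame was passed on."""
    if rx_handler(frame) is RxHandlerResult.CONSUMED:
        return None
    return tagger(frame)


def toy_tagger(frame: bytes) -> str:
    return "untagged:" + frame.hex()
```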
      Suggested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Tested-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9eb8eff0
    • A
      net: phy: Send notifier when starting the cable test · 9896a457
      Committed by Andrew Lunn
      Given that it takes time to run a cable test, send a notify message at
      the start, as well as when it is completed.
      
      v3:
      EMSGSIZE when ethnl_bcastmsg_put() fails
      Print an error message on failure, since this is a void function.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9896a457
    • A
      net: ethtool: Add helpers for reporting test results · 1e2dc145
      Committed by Andrew Lunn
      The PHY drivers can use these helpers for reporting the results. The
      results get translated into netlink attributes which are added to the
      pre-allocated skbuf.
      
      v3:
      Poison phydev->skb
      Return -EMSGSIZE when ethnl_bcastmsg_put() fails
      Return valid error code when nla_nest_start() fails
      Use u8 for results
      Actually put u32 length into message
      
      v4:
      s/ENOTSUPP/EOPNOTSUPP/g
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1e2dc145
    • A
      net: ethtool: Add infrastructure for reporting cable test results · 1dd3f212
      Committed by Andrew Lunn
      Provide infrastructure for PHY drivers to report the cable test
      results.  A netlink skb is associated to the phydev. Helpers will be
      added which can add results to this skb. Once the test has finished
      the results are sent to user space.
      
      When netlink ethtool is not part of the kernel configuration, stubs are
      provided. In that case it is also impossible to trigger a cable test, so
      the error code returned by the alloc function is of no consequence.
      
      v2:
      Include the status complete in the netlink notification message
      
      v4:
      Replace -EINVAL with -EMSGSIZE
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1dd3f212
    • A
      net: ethtool: Make helpers public · 0df960f1
      Committed by Andrew Lunn
      Make some helpers for building ethtool netlink messages available
      outside the compilation unit, so they can be used for building
      messages which are not simple get/set.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0df960f1
    • A
      net: ethtool: netlink: Add support for triggering a cable test · 11ca3c42
      Committed by Andrew Lunn
      Add new ethtool netlink calls to trigger the starting of a PHY cable
      test.
      
      Add Kconfig'ury to ETHTOOL_NETLINK so that PHYLIB is not a module when
      ETHTOOL_NETLINK is builtin, which would result in kernel linking errors.
      
      v2:
      Remove unwanted white space change
      Remove ethnl_cable_test_act_ops and use doit handler
      Rename cable_test_set_policy cable_test_act_policy
      Remove ETHTOOL_MSG_CABLE_TEST_ACT_REPLY
      
      v3:
      Remove ETHTOOL_MSG_CABLE_TEST_ACT_REPLY from documentation
      Remove unused cable_test_get_policy
      Add Reviewed-by tags
      
      v4:
      Remove unwanted blank line
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      11ca3c42
  3. 09 May, 2020 3 commits
  4. 08 May, 2020 12 commits
  5. 07 May, 2020 10 commits
    • P
      net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE · 16f80360
      Committed by Pablo Neira Ayuso
      This patch adds FLOW_ACTION_HW_STATS_DONT_CARE, which tells the driver
      that the frontend does not need counters; this hw stats type request
      never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests
      the driver to disable the stats; however, if the driver cannot disable
      counters, it bails out.
      
      TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*
      except for disabled, which is mapped to FLOW_ACTION_HW_STATS_DISABLED
      (this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
      TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.
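
      The mapping and the "never fails" property can be sketched as
      follows. The numeric values are assumptions made for this example
      (only "disabled is 0 in tc" comes from the text above), and
      driver_accepts() is a hypothetical stand-in for a driver-side check.

```python
# tc side; the text above states that "disabled" is 0 here.
TCA_ACT_HW_STATS_DISABLED = 0
TCA_ACT_HW_STATS_IMMEDIATE = 1 << 0   # assumed value
TCA_ACT_HW_STATS_DELAYED = 1 << 1     # assumed value

# flow_offload side; all values below are illustrative assumptions.
FLOW_ACTION_HW_STATS_IMMEDIATE = 1 << 0
FLOW_ACTION_HW_STATS_DELAYED = 1 << 1
FLOW_ACTION_HW_STATS_DISABLED = 1 << 2
FLOW_ACTION_HW_STATS_DONT_CARE = (1 << 3) - 1


def tc_act_hw_stats(tc_hw_stats: int) -> int:
    """Map a tc request to flow_offload: 1:1, except for 'disabled'."""
    if tc_hw_stats == TCA_ACT_HW_STATS_DISABLED:
        return FLOW_ACTION_HW_STATS_DISABLED
    return tc_hw_stats


def driver_accepts(hw_stats: int, can_disable_counters: bool) -> bool:
    """Hypothetical driver check for a device that always has counters."""
    if hw_stats == FLOW_ACTION_HW_STATS_DONT_CARE:
        return True                      # frontend does not care: never fails
    if hw_stats == FLOW_ACTION_HW_STATS_DISABLED:
        return can_disable_counters      # explicit request may be refused
    return True
```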
      
      Fixes: 319a1d19 ("flow_offload: check for basic action hw stats type")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      16f80360
    • O
      ethtool: provide UAPI for PHY master/slave configuration. · bdbdac76
      Committed by Oleksij Rempel
      This UAPI is needed for BroadR-Reach 100BASE-T1 devices. Due to lack of
      auto-negotiation support, we needed to be able to configure the
      MASTER-SLAVE role of the port manually or from an application in user
      space.
      
      The same UAPI can be used for 1000BASE-T or MultiGBASE-T devices to
      force MASTER or SLAVE role. See IEEE 802.3-2018:
      22.2.4.3.7 MASTER-SLAVE control register (Register 9)
      22.2.4.3.8 MASTER-SLAVE status register (Register 10)
      40.5.2 MASTER-SLAVE configuration resolution
      45.2.1.185.1 MASTER-SLAVE config value (1.2100.14)
      45.2.7.10 MultiGBASE-T AN control 1 register (Register 7.32)
      
      The MASTER-SLAVE role affects the clock configuration:
      
      -------------------------------------------------------------------------------
      When the PHY is configured as MASTER, the PMA Transmit function shall
      source TX_TCLK from a local clock source. When configured as SLAVE, the
      PMA Transmit function shall source TX_TCLK from the clock recovered from
      data stream provided by MASTER.
      
      iMX6Q                     KSZ9031                XXX
      ------\                /-----------\        /------------\
            |                |           |        |            |
       MAC  |<----RGMII----->| PHY Slave |<------>| PHY Master |
            |<--- 125 MHz ---+-<------/  |        | \          |
      ------/                \-----------/        \------------/
                                                     ^
                                                      \-TX_TCLK
      
      -------------------------------------------------------------------------------
      
      Since some clock or link related issues are only reproducible in a
      specific MASTER-SLAVE role, MAC and PHY configuration, it is beneficial
      to provide a generic (not 100BASE-T1 specific) interface to user space
      for configuration flexibility and troubleshooting.
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bdbdac76
    • F
      net: dsa: Do not leave DSA master with NULL netdev_ops · 050569fc
      Committed by Florian Fainelli
      When ndo_get_phys_port_name() for the CPU port was added we introduced
      an early check for when the DSA master network device in
      dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
      we perform the teardown operation in dsa_master_ndo_teardown() we would
      not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
      non-NULL initialized.
      
      With network device drivers such as virtio_net, this leads to a NULL
      pointer dereference (NPD) as
      soon as the DSA switch hanging off of it gets torn down because we are
      now assigning the virtio_net device's netdev_ops a NULL pointer.
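
      The failure mode can be reproduced with a toy model. The classes and
      the dict-based ops table below are hypothetical; only the shape of
      the bug (setup returns early without saving the original ops,
      teardown restores them unconditionally) follows the description
      above, and the guard shown is one plausible way to avoid it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Ops = Dict[str, Callable]


@dataclass
class NetDevice:
    netdev_ops: Optional[Ops]


@dataclass
class CpuPort:
    master: NetDevice
    orig_ndo_ops: Optional[Ops] = None


def master_ndo_setup(cpu_dp: CpuPort) -> None:
    ops = cpu_dp.master.netdev_ops
    if "ndo_get_phys_port_name" in ops:
        return                       # early exit: orig_ndo_ops stays None
    cpu_dp.orig_ndo_ops = ops
    cpu_dp.master.netdev_ops = dict(ops, ndo_get_phys_port_name=lambda: "p0")


def master_ndo_teardown(cpu_dp: CpuPort) -> None:
    if cpu_dp.orig_ndo_ops is None:
        return                       # nothing was replaced, nothing to restore
    cpu_dp.master.netdev_ops = cpu_dp.orig_ndo_ops
    cpu_dp.orig_ndo_ops = None


# A master that already implements the callback (as virtio_net does).
master = NetDevice({"ndo_get_phys_port_name": lambda: "virt"})
cpu_dp = CpuPort(master)
master_ndo_setup(cpu_dp)
master_ndo_teardown(cpu_dp)
```

      Without the guard in master_ndo_teardown(), the last call would
      leave master.netdev_ops set to None.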
      
      Fixes: da7b9e9b ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
      Reported-by: Allen Pais <allen.pais@oracle.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Tested-by: Allen Pais <allen.pais@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      050569fc
    • V
      net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirred · 65722159
      Committed by Vladimir Oltean
      This was caused by a poor merge conflict resolution on my side. The
      "act = &cls->rule->action.entries[0];" assignment was already present in
      the code prior to the patch mentioned below.
      
      Fixes: e13c2075 ("net: dsa: refactor matchall mirred action to separate function")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65722159
    • E
      tcp: defer xmit timer reset in tcp_xmit_retransmit_queue() · 916e6d1a
      Committed by Eric Dumazet
      As hinted in the prior change ("tcp: refine tcp_pacing_delay()
      for very low pacing rates"), it is probably best to arm
      the xmit timer only when all the packets have been scheduled,
      rather than when the head of the rtx queue has been re-sent.
      
      This does matter for flows having extremely low pacing rates,
      since their tp->tcp_wstamp_ns could be far in the future.
      
      Note that the regular xmit path has a stronger limit
      in tcp_small_queue_check(), meaning it is less likely to
      go beyond the pacing horizon.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      916e6d1a
    • E
      tcp: refine tcp_pacing_delay() for very low pacing rates · 8dc242ad
      Committed by Eric Dumazet
      With the addition of the horizon feature to sch_fq, we noticed some
      suboptimal behavior of extremely low pacing rate TCP flows, especially
      when TCP is not aware of a drop happening in lower stacks.
      
      Back in commit 3f80e08f ("tcp: add tcp_reset_xmit_timer() helper"),
      tcp_pacing_delay() was added to estimate an extra delay to add to standard
      rto timers.
      
      This patch removes the skb argument from this helper and
      tcp_reset_xmit_timer() because it makes more sense to simply
      consider the time at which the next packet is allowed to be sent,
      instead of the time of whatever packet has been sent.
      
      This avoids arming RTO timer too soon and removes
      spurious horizon drops.
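
      In other words, the extra delay depends only on how far in the
      future the next allowed transmit time lies. A sketch of that
      computation in nanoseconds, where now_ns and the function names are
      hypothetical and tcp_wstamp_ns stands for the per-socket "next
      packet may be sent at" timestamp mentioned in the related change:

```python
def pacing_delay_ns(tcp_wstamp_ns: int, now_ns: int) -> int:
    """Extra delay until the next packet is allowed out, never negative."""
    return max(0, tcp_wstamp_ns - now_ns)


def xmit_timer_expiry_ns(now_ns: int, rto_ns: int, tcp_wstamp_ns: int) -> int:
    """Arm the timer for the base timeout plus the pacing delay."""
    return now_ns + rto_ns + pacing_delay_ns(tcp_wstamp_ns, now_ns)
```

      For a flow paced so slowly that its next send time is seconds away,
      the timer is pushed out by the same amount instead of firing before
      the packet could even have left the host.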
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8dc242ad
    • A
      seg6: fix SRH processing to comply with RFC8754 · 0cb7498f
      Committed by Ahmed Abdelsalam
      The Segment Routing Header (SRH) which defines the SRv6 dataplane is defined
      in RFC8754.
      
      RFC8754 (section 4.1) defines the SR source node behavior which encapsulates
      packets into an outer IPv6 header and SRH. The SR source node encodes the
      full list of Segments that defines the packet path in the SRH. Then, the
      first segment from the list of Segments is copied into the Destination
      address of the outer IPv6 header and the packet is sent to the first hop
      in its path towards the destination.
      
      If the Segment list has only one segment, the SR source node can omit the
      SRH, as the only segment is carried in the destination address.
      
      RFC8754 (section 4.1.1) defines the Reduced SRH, when a source does not
      require the entire SID list to be preserved in the SRH. A reduced SRH does
      not contain the first segment of the related SR Policy (the first segment is
      the one already in the DA of the IPv6 header), and the Last Entry field is
      set to n-2, where n is the number of elements in the SR Policy.
      
      RFC8754 (section 4.3.1.1) defines the SRH processing and the logic to
      validate the SRH (S09, S10, S11) which works for both reduced and
      non-reduced behaviors.
      
      This patch updates seg6_validate_srh() to validate the SRH as per RFC8754.
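
      The relationship between the full and the reduced encoding can be
      worked through with a small model. This is a sketch based on my
      reading of RFC8754 (the segment list is stored in reverse order, and
      the S09 check compares Segments Left against Last Entry + 1); the
      function names and the dict layout are hypothetical, and TLVs are
      ignored.

```python
from typing import Dict, List, Optional, Tuple


def build_srh(sid_list: List[str], reduced: bool = False) -> Tuple[str, Optional[Dict]]:
    """Return (outer destination address, SRH fields or None if omitted)."""
    if not sid_list:
        raise ValueError("SR policy needs at least one segment")
    da = sid_list[0]                          # first segment goes into the DA
    carried = sid_list[1:] if reduced else sid_list
    if not carried:
        return da, None                       # single segment: SRH omitted
    segment_list = list(reversed(carried))    # Segment List[0] is the last SID
    return da, {
        "segment_list": segment_list,
        "last_entry": len(segment_list) - 1,  # n-1 full, n-2 reduced
        "segments_left": len(sid_list) - 1,
        "hdr_ext_len": 2 * len(segment_list), # 16-byte SIDs, 8-octet units
    }


def validate_srh(srh: Dict) -> bool:
    """Check in the spirit of S09: valid for reduced and non-reduced SRH."""
    max_last_entry = srh["hdr_ext_len"] // 2 - 1
    if srh["last_entry"] > max_last_entry:
        return False
    return srh["segments_left"] <= srh["last_entry"] + 1
```

      With a three-segment policy, the reduced form carries two SIDs, has
      Last Entry 1 and Segments Left 2, which passes only because the
      check allows Segments Left to exceed Last Entry by one.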
      Signed-off-by: Ahmed Abdelsalam <ahabdels@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0cb7498f
    • F
      ipv6: Implement draft-ietf-6man-rfc4941bis · 969c5464
      Committed by Fernando Gont
      Implement the upcoming rev of RFC4941 (IPv6 temporary addresses):
      https://tools.ietf.org/html/draft-ietf-6man-rfc4941bis-09
      
      * Reduces the default Valid Lifetime to 2 days
        The number of extra addresses employed when Valid Lifetime was
        7 days exacerbated the stress caused on network
        elements/devices. Additionally, the motivation for temporary
        addresses is indeed privacy and reduced exposure. With a
        default Valid Lifetime of 7 days, an address that becomes
        revealed by active communication is reachable and exposed for
        one whole week. The only use case for a Valid Lifetime of 7
        days could be some application that expects to have long-lived
        connections. But if you want to have long-lived connections,
        you shouldn't be using a temporary address in the first place.
        Additionally, in the era of mobile devices, general applications
        should nevertheless be prepared for and robust to address
        changes (e.g. nodes swap wifi <-> 4G, etc.).
      
      * Employs different IIDs for different prefixes
        To avoid network activity correlation among addresses configured
        for different prefixes
      
      * Uses a simpler algorithm for IID generation
        No need to store "history" anywhere
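
      The last two points can be illustrated together: each prefix gets
      its own freshly drawn interface identifier, and nothing has to be
      remembered between draws. This is only a sketch of the idea, not the
      kernel's algorithm; temp_address() is a hypothetical helper, it
      assumes /64 prefixes, and it skips any filtering of special
      identifiers.

```python
import ipaddress
import random

IID_MASK = (1 << 64) - 1


def temp_address(prefix: str, rng: random.Random) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a random 64-bit interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    iid = rng.getrandbits(64)      # no stored history, just a fresh draw
    return net.network_address + iid


# A seeded generator keeps the example reproducible; real use would want
# an unpredictable source such as random.SystemRandom().
rng = random.Random(1)
addr_a = temp_address("2001:db8:1::/64", rng)
addr_b = temp_address("2001:db8:2::/64", rng)
```

      Since the two identifiers are drawn independently, an observer
      cannot link the address under one prefix to the address under the
      other by comparing their lower 64 bits.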
      Signed-off-by: Fernando Gont <fgont@si6networks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      969c5464
    • M
      net: hsr: fix incorrect type usage for protocol variable · f5dda315
      Committed by Murali Karicheri
      Fix the following sparse checker warnings:
      
      net/hsr/hsr_slave.c:38:18: warning: incorrect type in assignment (different base types)
      net/hsr/hsr_slave.c:38:18:    expected unsigned short [unsigned] [usertype] protocol
      net/hsr/hsr_slave.c:38:18:    got restricted __be16 [usertype] h_proto
      net/hsr/hsr_slave.c:39:25: warning: restricted __be16 degrades to integer
      net/hsr/hsr_slave.c:39:57: warning: restricted __be16 degrades to integer
      Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f5dda315
    • J
      net: bridge: return false in br_mrp_enabled() · 8741e184
      Committed by Jason Yan
      Fix the following coccicheck warning:
      
      net/bridge/br_private.h:1334:8-9: WARNING: return of 0/1 in function
      'br_mrp_enabled' with return type bool
      
      Fixes: 65369933 ("bridge: mrp: Integrate MRP into the bridge")
      Signed-off-by: Jason Yan <yanaijie@huawei.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8741e184
  6. 06 May, 2020 1 commit