1. 29 May, 2020 10 commits
    • net: add sock_enable_timestamps · 783da70e
      Christoph Hellwig committed
      Add a helper to directly enable timestamps instead of setting the
      SO_TIMESTAMP* sockopts from kernel space and going through a fake
      uaccess.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add sock_bindtoindex · 7594888c
      Christoph Hellwig committed
      Add a helper to directly set the SO_BINDTOIFINDEX sockopt from kernel
      space without going through a fake uaccess.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add sock_set_sndtimeo · 76ee0785
      Christoph Hellwig committed
      Add a helper to directly set the SO_SNDTIMEO_NEW sockopt from kernel
      space without going through a fake uaccess.  The interface is
      simplified to only pass the seconds value, as that is the only
      thing needed at the moment.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
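For comparison, the userspace path that the sock_set_sndtimeo helper bypasses can be sketched as follows. This is only an illustration of the sockopt semantics, not the kernel helper itself: the function names `set_send_timeout` and `get_send_timeout` are hypothetical, and the `"ll"` timeval layout is an assumption that holds on typical 64-bit Linux.

```python
import socket
import struct

# Assumption: struct timeval is two native C longs (tv_sec, tv_usec),
# which is the layout on typical 64-bit Linux systems.
TIMEVAL_FMT = "ll"


def set_send_timeout(sock, secs):
    """Set SO_SNDTIMEO from userspace, passing only whole seconds."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO,
                    struct.pack(TIMEVAL_FMT, secs, 0))


def get_send_timeout(sock):
    """Read SO_SNDTIMEO back as a (seconds, microseconds) tuple."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO,
                          struct.calcsize(TIMEVAL_FMT))
    return struct.unpack(TIMEVAL_FMT, raw)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    set_send_timeout(s, 5)
    send_timeout = get_send_timeout(s)
```

Like the helper described in the commit, the sketch takes only a seconds value and leaves the microseconds field at zero.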
    • net: add sock_set_priority · 6e434967
      Christoph Hellwig committed
      Add a helper to directly set the SO_PRIORITY sockopt from kernel space
      without going through a fake uaccess.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add sock_no_linger · c433594c
      Christoph Hellwig committed
      Add a helper to directly set the SO_LINGER sockopt from kernel space
      with onoff set to true and a linger time of 0 without going through a
      fake uaccess.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: David S. Miller <davem@davemloft.net>
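The SO_LINGER setting that sock_no_linger applies (onoff true, linger time 0) looks roughly like this when done from userspace. This is a sketch of the option semantics only: `set_no_linger` and `get_linger` are hypothetical names, and the two-int `struct linger` layout is an assumption that matches Linux.

```python
import socket
import struct

# Assumption: struct linger is two C ints { l_onoff, l_linger }, as on Linux.
LINGER_FMT = "ii"


def set_no_linger(sock):
    """Enable SO_LINGER with a linger time of 0 seconds."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(LINGER_FMT, 1, 0))


def get_linger(sock):
    """Return the current (l_onoff, l_linger) pair."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(LINGER_FMT))
    return struct.unpack(LINGER_FMT, raw)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    set_no_linger(s)
    linger_state = get_linger(s)
```

With these values, closing a connected TCP socket discards unsent data instead of lingering to deliver it.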
    • net: add sock_set_reuseaddr · b58f0e8f
      Christoph Hellwig committed
      Add a helper to directly set the SO_REUSEADDR sockopt from kernel space
      without going through a fake uaccess.
      
      For this the iscsi target now has to formally depend on inet to avoid
      a mostly theoretical compile failure.  For actual operation it already
      did depend on having ipv4 or ipv6 support.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: David S. Miller <davem@davemloft.net>
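As a point of reference, enabling SO_REUSEADDR through the regular setsockopt path (the route the in-kernel helper avoids) can be sketched as below. `enable_reuseaddr` is a hypothetical name used only for this illustration.

```python
import socket


def enable_reuseaddr(sock):
    """Turn on SO_REUSEADDR via the ordinary userspace sockopt path."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    # A freshly created socket starts with the option cleared.
    reuse_before = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
    enable_reuseaddr(s)
    reuse_after = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
```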
    • Merge tag 'mlx5-updates-2020-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 1eba1110
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2020-05-26
      
      Updates highlights:
      
      1) From Vu Pham (8): Support VM traffics failover with bonded VF
      representors and e-switch egress/ingress ACLs
      
      This series introduces support for a Virtual Machine running I/O
      traffic over the direct/fast VF path and failing over to the slower
      paravirtualized path, using the following features:
      
           __________________________________
          |  VM      _________________        |
          |          |FAILOVER device |       |
          |          |________________|       |
          |                  |                |
          |              ____|_____           |
          |              |         |          |
          |       ______ |___  ____|_______   |
          |       |  VF PT  |  |VIRTIO-NET |  |
          |       | device  |  | device    |  |
          |       |_________|  |___________|  |
          |___________|______________|________|
                      |              |
                      | HYPERVISOR   |
                      |          ____|______
                      |         |  macvtap  |
                      |         |virtio BE  |
                      |         |___________|
                      |               |
                      |           ____|_____
                      |           |host VF  |
                      |           |_________|
                      |               |
                 _____|______    _____|_____
                 |  PT VF    |  |  host VF  |
                 |representor|  |representor|
                 |___________|  |___________|
                      \               /
                       \             /
                        \           /
                         \         /                     _________________
                          \_______/                     |                |
                       _______|________                 |    V-SWITCH    |
                      |VF representors |________________|      (OVS)     |
                      |      bond      |                |________________|
                      |________________|                        |
                                                        ________|________
                                                       |    Uplink       |
                                                       |  representor    |
                                                       |_________________|
      
      Summary:
      --------
      Problem statement:
      ------------------
      Currently in above topology, when netfailover device is configured using
      VFs and eswitch VF representors, and when traffic fails over to stand-by
      VF which is exposed using macvtap device to guest VM, eswitch fails to
      switch the traffic to the stand-by VF representor. This occurs because
      there is no knowledge at eswitch level of the stand-by representor
      device.
      
      Solution:
      ---------
      Using standard bonding driver, a bond netdevice is created over VF
      representor device which is used for offloading tc rules.
      Two VF representors are bonded together, one for the passthrough VF
      device and another one for the stand-by VF device.
      With this solution, the mlx5 driver listens to the failover events
      occurring at the bond device level to fail traffic over to whichever
      VF representor of the bond is active.
      
      a. VM with netfailover device of VF pass-thru (PT) device and virtio-net
         paravirtualized device with same MAC-address to handle failover
         traffics at VM level.
      
      b. Host bond is active-standby mode, with the lower devices being the VM
         VF PT representor, and the representor of the 2nd VF to handle
         failover traffics at Hypervisor/V-Switch OVS level.
         - During the steady state (fast datapath): set the bond active
           device to be the VM PT VF representor.
         - During failover: apply bond failover to the second VF representor
           device which connects to the VM non-accelerated path.
      
      c. E-Switch ingress/egress ACL tables to support failover traffics at
         E-Switch level
         I. E-Switch egress ACL with forward-to-vport rule:
           - By default, eswitch vport egress acl forward packets to its
             counterpart NIC vport.
           - During port failover, the egress acl forward-to-vport rule will
             be added to e-switch vport of passive/in-active slave VF
             representor to forward packets to other e-switch vport ie. the
             active slave representor's e-switch vport to handle egress
             "failover" traffics.
           - Using lower change netdev event to detect a representor is a
             lower dev (slave) of bond and becomes active, adding egress acl
             forward-to-vport rule of all other slave netdevs to forward to
             this representor's vport.
           - Using upper change netdev event to detect a representor unslaving
             from bond device to delete its vport's egress acl forward-to-vport
             rule.
      
         II. E-Switch ingress ACL metadata reg_c for match
           - Bonded representors' vports sharing tc block have the same
             root ingress acl table and a unique metadata for match.
           - Traffics from both representors' vports will be tagged with same
             unique metadata reg_c.
           - Using upper change netdev event to detect a representor
             enslaving/unslaving from bond device to setup shared root ingress
             acl and unique metadata.
      
      2) From Alex Vesker (2): Split RX and TX lock for parallel rule insertion in
      software steering

      3) From Eli Britstein (2): Optimize performance for IPv4/IPv6 ethertype: use
      the HW ip_version register rather than parsing eth frames for ethertype.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
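The egress ACL behaviour described in the merge message above can be modelled with a small toy state machine. This is a conceptual sketch only, not mlx5 driver code: the class `BondEgressModel`, its methods, and the integer vport numbers are all hypothetical, and the model ignores what happens to rules that still point at a vport after it leaves the bond.

```python
class BondEgressModel:
    """Toy model of egress ACL forward-to-vport rules on bonded representors."""

    def __init__(self):
        self.slaves = set()   # vports whose representors are enslaved to the bond
        self.forward = {}     # vport -> vport its egress traffic is redirected to

    def enslave(self, vport):
        """A representor joins the bond (upper change event)."""
        self.slaves.add(vport)

    def set_active(self, vport):
        """A representor becomes the active lower device (lower change event)."""
        if vport not in self.slaves:
            raise ValueError("vport is not part of the bond")
        # The active vport delivers to its own NIC vport again.
        self.forward.pop(vport, None)
        # Every other slave forwards its egress traffic to the active vport.
        for other in self.slaves - {vport}:
            self.forward[other] = vport

    def unslave(self, vport):
        """A representor leaves the bond: delete its own forward-to-vport rule."""
        self.slaves.discard(vport)
        self.forward.pop(vport, None)

    def egress_destination(self, vport):
        """Default is the counterpart NIC vport, modelled here as the vport itself."""
        return self.forward.get(vport, vport)


model = BondEgressModel()
model.enslave(1)      # passthrough VF representor's vport
model.enslave(2)      # stand-by VF representor's vport
model.set_active(1)   # steady state: fast datapath
steady = (model.egress_destination(1), model.egress_destination(2))
model.set_active(2)   # failover to the stand-by path
failed_over = (model.egress_destination(1), model.egress_destination(2))
```

In the steady state both vports deliver to vport 1. After failover both deliver to vport 2, which corresponds to the non-accelerated path in the description.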
    • tcp: ipv6: support RFC 6069 (TCP-LD) · d2924569
      Eric Dumazet committed
      Make tcp_ld_RTO_revert() helper available to IPv6, and
      implement RFC 6069 :
      
      Quoting this RFC :
      
      3. Connectivity Disruption Indication
      
         For Internet Protocol version 6 (IPv6) [RFC2460], the counterpart of
         the ICMP destination unreachable message of code 0 (net unreachable)
         and of code 1 (host unreachable) is the ICMPv6 destination
         unreachable message of code 0 (no route to destination) [RFC4443].
         As with IPv4, a router should generate an ICMPv6 destination
         unreachable message of code 0 in response to a packet that cannot be
         delivered to its destination address because it lacks a matching
         entry in its routing table.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
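The idea behind the RTO revert in RFC 6069 can be sketched with a simplified model: when an ICMPv6 destination unreachable message of code 0 arrives for the segment being retransmitted, one step of exponential backoff is undone. This is a rough illustration under stated assumptions, not the kernel's tcp_ld_RTO_revert() logic: the `Connection` class, `on_icmpv6_error` function, and the 120-second cap are choices made for this sketch.

```python
from dataclasses import dataclass

ICMPV6_DEST_UNREACH = 1   # ICMPv6 type: destination unreachable
ICMPV6_NOROUTE = 0        # ICMPv6 code: no route to destination
RTO_MAX = 120.0           # assumed upper bound on the RTO, in seconds


@dataclass
class Connection:
    base_rto: float   # RTO before any backoff, in seconds
    backoff: int      # number of exponential backoff steps applied so far
    snd_una: int      # oldest unacknowledged sequence number

    def rto(self):
        """Current RTO after exponential backoff, capped at RTO_MAX."""
        return min(self.base_rto * (2 ** self.backoff), RTO_MAX)


def on_icmpv6_error(conn, icmp_type, icmp_code, seq, elapsed):
    """Undo one backoff step if the error matches the retransmitted segment.

    Returns the time left until the next retransmission, or None when the
    message is ignored.
    """
    if icmp_type != ICMPV6_DEST_UNREACH or icmp_code != ICMPV6_NOROUTE:
        return None
    if seq != conn.snd_una or conn.backoff == 0:
        return None
    conn.backoff -= 1
    return max(conn.rto() - elapsed, 0.0)


conn = Connection(base_rto=1.0, backoff=3, snd_una=1000)
ignored = on_icmpv6_error(conn, ICMPV6_DEST_UNREACH, ICMPV6_NOROUTE, 999, 2.5)
remaining = on_icmpv6_error(conn, ICMPV6_DEST_UNREACH, ICMPV6_NOROUTE, 1000, 2.5)
```

Here the backed-off RTO drops from 8 s to 4 s, and because 2.5 s have already elapsed since the last retransmission, the next one is due in 1.5 s.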
    • net: dsa: sja1105: offload the Credit-Based Shaper qdisc · 4d752508
      Vladimir Oltean committed
      SJA1105, being AVB/TSN switches, provide hardware assist for the
      Credit-Based Shaper as described in the IEEE 802.1Q-2018 document.
      
      First generation has 10 shapers, freely assignable to any of the 4
      external ports and 8 traffic classes, and second generation has 16
      shapers.
      
      The Credit-Based Shaper tables are accessed through the dynamic
      reconfiguration interface, so we have to restore them manually after a
      switch reset. The tables are backed up by the static config only on
      P/Q/R/S, and we don't want to add custom code only for that family,
      since the procedure that is in place now works for both.
      
      Tested with the following commands:
      
      data_rate_kbps=67000
      port_transmit_rate_kbps=1000000
      idleslope=$data_rate_kbps
      sendslope=$(($idleslope - $port_transmit_rate_kbps))
      locredit=$((-0x80000000))
      hicredit=$((0x7fffffff))
      tc qdisc add dev swp2 root handle 1: mqprio hw 0 num_tc 8 \
              map 0 1 2 3 4 5 6 7 \
              queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7
      tc qdisc replace dev swp2 parent 1:1 cbs \
              idleslope $idleslope \
              sendslope $sendslope \
              hicredit $hicredit \
              locredit $locredit \
              offload 1
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
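The shell arithmetic in the test commands above can be checked with a short calculation. This sketch only mirrors the variable names and values from the commit message; `cbs_parameters` is a hypothetical helper name.

```python
def cbs_parameters(data_rate_kbps, port_transmit_rate_kbps):
    """Reproduce the slope and credit values computed by the shell snippet."""
    idleslope = data_rate_kbps
    # sendslope is negative: idleslope minus the port transmit rate.
    sendslope = idleslope - port_transmit_rate_kbps
    return {
        "idleslope": idleslope,
        "sendslope": sendslope,
        "locredit": -0x80000000,  # smallest signed 32-bit value
        "hicredit": 0x7FFFFFFF,   # largest signed 32-bit value
    }


params = cbs_parameters(data_rate_kbps=67000, port_transmit_rate_kbps=1000000)
```

For the values in the commit, this gives an idleslope of 67000 kbps and a sendslope of -933000 kbps, with the credits set to the limits of a signed 32-bit integer.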
    • selftests: Add torture tests to nexthop tests · 7c741868
      David Ahern committed
      Add Nik's torture tests as a new set to stress the replace and cleanup
      paths.
      
      The torture test was created by Nikolay Aleksandrov; I then adapted it
      to a selftest and added an IPv6 version.
      Signed-off-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 28 May, 2020 30 commits