1. 18 Aug, 2019 12 commits
    • P
      selftests/bpf: fix race in test_tcp_rtt test · fae55527
      Petar Penkov committed
      There is a race in this test between receiving the ACK for the
      single-byte packet sent in the test, and reading the values from the
      map.
      
      This patch fixes this by having the client wait until there are no more
      unacknowledged packets.
      
      Before:
      for i in {1..1000}; do ../net/in_netns.sh ./test_tcp_rtt; \
      done | grep -c PASSED
      < trimmed error messages >
      993
      
      After:
      for i in {1..10000}; do ../net/in_netns.sh ./test_tcp_rtt; \
      done | grep -c PASSED
      10000
      
      Fixes: b5587398 ("selftests/bpf: test BPF_SOCK_OPS_RTT_CB")
      Signed-off-by: Petar Penkov <ppenkov@google.com>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • A
      libbpf: relicense bpf_helpers.h and bpf_endian.h · 929ffa6e
      Andrii Nakryiko committed
      bpf_helpers.h and bpf_endian.h contain useful macros and BPF helper
      definitions essential to almost every BPF program, which makes them
      useful not just for selftests. To be able to expose them as part of
      libbpf, though, we need them to be dual-licensed as LGPL-2.1 OR
      BSD-2-Clause. This patch updates the licensing of those two files.
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Hechao Li <hechaol@fb.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Andrey Ignatov <rdna@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Lawrence Brakmo <brakmo@fb.com>
      Acked-by: Adam Barth <arb@fb.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Acked-by: Josef Bacik <jbacik@fb.com>
      Acked-by: Joe Stringer <joe@wand.net.nz>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Acked-by: David Ahern <dsahern@gmail.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Acked-by: Lorenz Bauer <lmb@cloudflare.com>
      Acked-by: Adrian Ratiu <adrian.ratiu@collabora.com>
      Acked-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Petar Penkov <ppenkov@google.com>
      Acked-by: Teng Qin <palmtenor@gmail.com>
      Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Michal Rostecki <mrostecki@opensuse.org>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      net: Don't call XDP_SETUP_PROG when nothing is changed · c14a9f63
      Maxim Mikityanskiy committed
      Don't uninstall an XDP program when none is installed, and don't install
      an XDP program that has the same ID as the one already installed.
      
      dev_change_xdp_fd doesn't perform any checks in case it uninstalls an
      XDP program. It means that the driver's ndo_bpf can be called with
      XDP_SETUP_PROG asking to set it to NULL even if it's already NULL. This
      case happens if the user runs `ip link set eth0 xdp off` when there is
      no XDP program attached.
      
      The symmetrical case is possible when the user tries to set the program
      that is already set.
      
      The drivers typically perform some heavy operations on XDP_SETUP_PROG,
      so they all have to handle these cases internally to return early if
      they happen. This patch puts this check into the kernel code, so that
      all drivers will benefit from it.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
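The early-return check this commit describes can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct xdp_dev_model` and `xdp_model_set_prog()` are hypothetical names, program ID 0 stands for "no program attached", and the counter stands in for the driver's expensive `XDP_SETUP_PROG` handling. It is not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device: attached_id == 0 means no XDP program. */
struct xdp_dev_model {
	uint32_t attached_id;
	unsigned int setup_calls; /* simulated XDP_SETUP_PROG invocations */
};

/*
 * Request that program new_id be installed (0 means "uninstall").
 * Returns true if the simulated driver callback was invoked, false if
 * the request was a no-op and was filtered out before reaching it.
 */
bool xdp_model_set_prog(struct xdp_dev_model *dev, uint32_t new_id)
{
	/* Covers both "off" when nothing is attached and "same program". */
	if (new_id == dev->attached_id)
		return false;

	dev->setup_calls++;          /* the heavy driver work would happen here */
	dev->attached_id = new_id;
	return true;
}
```

With this model, repeating `ip link set eth0 xdp off` on a device without a program corresponds to `xdp_model_set_prog(&dev, 0)` returning `false` without touching the driver.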
    • D
      Merge branch 'bpf-af-xdp-wakeup' · c8186c80
      Daniel Borkmann committed
      Magnus Karlsson says:
      
      ====================
      This patch set adds support for a new flag called need_wakeup in the
      AF_XDP Tx and fill rings. When this flag is set by the driver, it
      means that the application has to explicitly wake up the kernel Rx
      (for the bit in the fill ring) or kernel Tx (for the bit in the Tx
      ring) processing by issuing a syscall. poll() can wake up both and
      sendto() will wake up Tx processing only.
      
      The main reason for introducing this new flag is to be able to
      efficiently support the case when application and driver are executing
      on the same core. Previously, the driver was just busy-spinning on the
      fill ring if it ran out of buffers in the HW and there were none to
      get from the fill ring. This approach works when the application and
      driver are running on different cores as the application can replenish
      the fill ring while the driver is busy-spinning. However, this is a
      poor approach if both of them are running on the same core as the
      probability of the fill ring getting more entries while the driver is
      busy-spinning is zero. With this new feature the driver now sets the
      need_wakeup flag and returns to the application. The application can
      then replenish the fill queue and then explicitly wake up the Rx
      processing in the kernel using the syscall poll(). For Tx, the flag is
      only set to one if the driver has no outstanding Tx completion
      interrupts. If it has some, the flag is zero as it will be woken up by
      a completion interrupt anyway. This flag can also be used in other
      situations where the driver needs to be woken up explicitly.
      
      As a nice side effect, this new flag also improves the Tx performance
      of the case where application and driver are running on two different
      cores as it reduces the number of syscalls to the kernel. The kernel
      tells user space if it needs to be woken up by a syscall, and this
      eliminates many of the syscalls. The Rx performance of the 2-core case
      is, on the other hand, slightly worse, since a syscall is now needed
      to wake up the driver, instead of the driver busy-spinning. It does
      waste fewer CPU cycles though, which might lead to better overall
      system performance.
      
      This new flag needs some simple driver support. If the driver does not
      support it, the Rx flag is always zero and the Tx flag is always
      one. This makes any application relying on this feature default to the
      old behavior of not requiring any syscalls in the Rx path and always
      having to call sendto() in the Tx path.
      
      For backwards compatibility reasons, this feature has to be explicitly
      turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend
      that you always turn it on as it has a large positive performance
      impact for the one core case and does not degrade 2 core performance
      and actually improves it for Tx heavy workloads.
      
      Here are some performance numbers measured on my local,
      non-performance optimized development system. That is why you are
      seeing numbers lower than the ones from Björn and Jesper. 64 byte
      packets at 40Gbit/s line rate. All results in Mpps. Cores == 1 means
      that both application and driver are executing on the same core. Cores
      == 2 means that they are on different cores.
      
                                    Applications
      need_wakeup  cores    txpush    rxdrop      l2fwd
      ---------------------------------------------------------------
           n         1       0.07      0.06        0.03
           y         1       21.6      8.2         6.5
           n         2       32.3      11.7        8.7
           y         2       33.1      11.7        8.7
      
      Overall, the need_wakeup flag provides the same or better performance
      in all the micro-benchmarks. The reduction of sendto() calls in txpush
      is large. Only a few per second are needed. For l2fwd, the drop is 50%
      for the 1 core case and more than 99.9% for the 2 core case. I do not
      yet know why I am not seeing the same drop for the 1 core case.
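To make the table easier to read, the throughput ratio with and without need_wakeup can be computed per row. The sketch below only restates arithmetic on the numbers quoted above; the struct and function names are hypothetical.

```c
#include <stdio.h>

/* One application/core combination from the table, throughput in Mpps. */
struct wakeup_result {
	const char *app;
	int cores;
	double without_mpps; /* need_wakeup = n */
	double with_mpps;    /* need_wakeup = y */
};

/* Ratio of throughput with need_wakeup to throughput without it. */
double wakeup_speedup(const struct wakeup_result *r)
{
	return r->with_mpps / r->without_mpps;
}

/* Print one line per row; returns the number of rows printed. */
int print_speedups(const struct wakeup_result *rows, int n)
{
	int i;

	for (i = 0; i < n; i++)
		printf("%-7s cores=%d  %.2fx\n", rows[i].app, rows[i].cores,
		       wakeup_speedup(&rows[i]));
	return n;
}
```

For example, the one-core txpush row (0.07 versus 21.6 Mpps) gives a ratio of roughly 300, while the two-core rxdrop and l2fwd rows give exactly 1.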
      
      The name and inspiration of the flag have been taken from io_uring by
      Jens Axboe. Details about this feature in io_uring can be found in
      http://kernel.dk/io_uring.pdf, section 8.3. It also addresses most of
      the denial of service and sendto() concerns raised by Maxim
      Mikityanskiy in https://www.spinics.net/lists/netdev/msg554657.html.
      
      The typical Tx part of an application will have to change from:
      
      ret = sendto(fd,....)
      
      to:
      
      if (xsk_ring_prod__needs_wakeup(&xsk->tx))
             ret = sendto(fd,....)
      
      and the Rx part from:
      
      rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx);
      if (!rcvd)
             return;
      
      to:
      
      rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx);
      if (!rcvd) {
             if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
                    ret = poll(fd,.....);
             return;
      }
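The before/after pattern above can be exercised without a NIC using a tiny model. This is a sketch under stated assumptions: `struct model_ring`, `MODEL_RING_NEED_WAKEUP`, and the counters are hypothetical stand-ins for the libbpf ring structures and for the `sendto()`/`poll()` syscalls; only the control flow mirrors the snippets above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_RING_NEED_WAKEUP (1u << 0) /* hypothetical flag bit for this model */

/* The "driver" writes flags; the application only reads them. */
struct model_ring {
	uint32_t flags;
};

/* Counts how often the application would enter the kernel. */
struct model_stats {
	unsigned int tx_kicks; /* stands in for sendto() calls */
	unsigned int rx_kicks; /* stands in for poll() calls */
};

static bool model_ring_needs_wakeup(const struct model_ring *r)
{
	return (r->flags & MODEL_RING_NEED_WAKEUP) != 0;
}

/* Tx path: kick the kernel only when the Tx ring asks for it. */
void model_tx_kick(const struct model_ring *tx, struct model_stats *st)
{
	if (model_ring_needs_wakeup(tx))
		st->tx_kicks++; /* real code would call sendto() here */
}

/* Rx path: poll only when nothing was received and the fill ring asks. */
void model_rx_step(unsigned int rcvd, const struct model_ring *fq,
		   struct model_stats *st)
{
	if (rcvd == 0 && model_ring_needs_wakeup(fq))
		st->rx_kicks++; /* real code would call poll() here */
}
```

When the flag is clear, both paths skip the syscall entirely, which is where the reduction in `sendto()` calls comes from.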
      
      v3 -> v4:
      * Maxim found a possible race in the Tx part of the driver. The
        setting of the flag needs to happen before the sending, otherwise it
        might trigger this race. Fixed in ixgbe and i40e driver.
      * Mellanox support contributed by Maxim
      * Removed the XSK_DRV_CAN_SLEEP flag as it was not used
        anymore. Thanks to Sridhar for discovering this.
      * For consistency the feature is now always called need_wakeup. There
        were some places where it was referred to as might_sleep, but they
        have been removed. Thanks to Sridhar for spotting.
      * Fixed some typos in the commit messages
      
      v2 -> v3:
      * Converted the Mellanox driver to the new ndo in patch 1 as pointed
        out by Maxim
      * Fixed the compatibility code of XDP_MMAP_OFFSETS so it now works.
      
      v1 -> v2:
      * Fixed bisectability problem pointed out by Jakub
      * Added missing initialization of the Tx need_wakeup flag to 1
      
      This patch has been applied against commit b753c5a7 ("Merge branch 'r8152-RX-improve'")
      
      Structure of the patch set:
      
      Patch 1: Replaces the ndo_xsk_async_xmit with ndo_xsk_wakeup to
               support waking up both Rx and Tx processing
      Patch 2: Implements the need_wakeup functionality in common code
      Patch 3-4: Add need_wakeup support to the i40e and ixgbe drivers
      Patch 5: Add need_wakeup support to libbpf
      Patch 6: Add need_wakeup support to the xdpsock sample application
      Patch 7-8: Add need_wakeup support to the Mellanox mlx5 driver
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      net/mlx5e: Add AF_XDP need_wakeup support · a7bd4018
      Maxim Mikityanskiy committed
      This commit adds support for the new need_wakeup feature of AF_XDP.
      Applications can opt in by using the XDP_USE_NEED_WAKEUP bind() flag.
      When this feature is enabled, some behavior changes:
      
      RX side: If the Fill Ring is empty, instead of busy-polling, set the
      flag to tell the application to kick the driver when it refills the Fill
      Ring.
      
      TX side: If there are pending completions or packets queued for
      transmission, clear the flag to tell the application that it can skip
      the sendto() syscall and save time.
      
      The performance testing was performed on a machine with the following
      configuration:
      
      - 24 cores of Intel Xeon E5-2620 v3 @ 2.40 GHz
      - Mellanox ConnectX-5 Ex with 100 Gbit/s link
      
      The results with retpoline disabled:
      
             | without need_wakeup  | with need_wakeup     |
             |----------------------|----------------------|
             | one core | two cores | one core | two cores |
      -------|----------|-----------|----------|-----------|
      txonly | 20.1     | 33.5      | 29.0     | 34.2      |
      rxdrop | 0.065    | 14.1      | 12.0     | 14.1      |
      l2fwd  | 0.032    | 7.3       | 6.6      | 7.2       |
      
      "One core" means the application and NAPI run on the same core. "Two
      cores" means they are pinned to different cores.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      net/mlx5e: Move the SW XSK code from NAPI poll to a separate function · 871aa189
      Maxim Mikityanskiy committed
      Two XSK tasks are performed during NAPI polling that are not bound to
      hardware interrupts: TXing packets and polling for frames in the Fill
      Ring. They are special in that the hardware doesn't know about these
      tasks, so it doesn't trigger interrupts if there is still some work
      to be done. It's our driver's responsibility to ensure NAPI will be
      rescheduled if needed.
      
      Create a new function to handle these tasks and move the corresponding
      code from mlx5e_napi_poll to the new function to improve modularity and
      prepare for the changes in the following patch.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      samples/bpf: add use of need_wakeup flag in xdpsock · 46738f73
      Magnus Karlsson committed
      This commit adds use of the need_wakeup flag to the xdpsock sample
      application. It is turned on by default as we think it is a feature
      that seems to always produce a performance benefit, if the application
      has been written to take advantage of it. It can be turned off in the
      sample app by using the '-m' command line option.
      
      The txpush and l2fwd sub applications have also been updated to
      support poll() with multiple sockets.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      libbpf: add support for need_wakeup flag in AF_XDP part · a4500432
      Magnus Karlsson committed
      This commit adds support for the new need_wakeup flag in AF_XDP. The
      xsk_socket__create function is updated to handle this and a new
      function is introduced called xsk_ring_prod__needs_wakeup(). This
      function can be used by the application to check if Rx and/or Tx
      processing needs to be explicitly woken up.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      ixgbe: add support for AF_XDP need_wakeup feature · 5c129241
      Magnus Karlsson committed
      This patch adds support for the need_wakeup feature of AF_XDP. If the
      application has told the kernel that it might sleep using the new bind
      flag XDP_USE_NEED_WAKEUP, the driver will then set this flag if it has
      no more buffers on the NIC Rx ring and yield to the application. For
      Tx, it will set the flag if it has no outstanding Tx completion
      interrupts and return to the application.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      i40e: add support for AF_XDP need_wakeup feature · 3d0c5f1c
      Magnus Karlsson committed
      This patch adds support for the need_wakeup feature of AF_XDP. If the
      application has told the kernel that it might sleep using the new bind
      flag XDP_USE_NEED_WAKEUP, the driver will then set this flag if it has
      no more buffers on the NIC Rx ring and yield to the application. For
      Tx, it will set the flag if it has no outstanding Tx completion
      interrupts and return to the application.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • M
      xsk: add support for need_wakeup flag in AF_XDP rings · 77cd0d7b
      Magnus Karlsson committed
      This commit adds support for a new flag called need_wakeup in the
      AF_XDP Tx and fill rings. When this flag is set, it means that the
      application has to explicitly wake up the kernel Rx (for the bit in
      the fill ring) or kernel Tx (for the bit in the Tx ring) processing by
      issuing a syscall. poll() can wake up both depending on the flags
      submitted and sendto() will wake up Tx processing only.
      
      The main reason for introducing this new flag is to be able to
      efficiently support the case when application and driver are executing
      on the same core. Previously, the driver was just busy-spinning on the
      fill ring if it ran out of buffers in the HW and there were none on
      the fill ring. This approach works when the application is running on
      another core as it can replenish the fill ring while the driver is
      busy-spinning. However, this is a poor approach if both of them are
      running on the same core as the probability of the fill ring getting
      more entries while the driver is busy-spinning is zero. With this new
      feature the driver now sets the need_wakeup flag and returns to the
      application. The application can then replenish the fill queue and
      then explicitly wake up the Rx processing in the kernel using the
      syscall poll(). For Tx, the flag is only set to one if the driver has
      no outstanding Tx completion interrupts. If it has some, the flag is
      zero as it will be woken up by a completion interrupt anyway.
      
      As a nice side effect, this new flag also improves the performance of
      the case where application and driver are running on two different
      cores as it reduces the number of syscalls to the kernel. The kernel
      tells user space if it needs to be woken up by a syscall, and this
      eliminates many of the syscalls.
      
      This flag needs some simple driver support. If the driver does not
      support this, the Rx flag is always zero and the Tx flag is always
      one. This makes any application relying on this feature default to the
      old behaviour of not requiring any syscalls in the Rx path and always
      having to call sendto() in the Tx path.
      
      For backwards compatibility reasons, this feature has to be explicitly
      turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend
      that you always turn it on as it has so far always had a positive
      performance impact.
      
      The name and inspiration of the flag have been taken from io_uring by
      Jens Axboe. Details about this feature in io_uring can be found in
      http://kernel.dk/io_uring.pdf, section 8.3.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
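The driver-side rules stated in this commit message can be summarized as three small predicates. This is a sketch, not kernel code: all names are hypothetical, and the buffer and completion counts are simplified stand-ins for driver state.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two need_wakeup flags seen by user space. */
struct model_wakeup_state {
	bool rx_need_wakeup;
	bool tx_need_wakeup;
};

/*
 * Values for a driver without need_wakeup support: Rx never asks for a
 * wakeup, Tx always does, which matches the old behavior (no syscalls
 * on Rx, sendto() always required on Tx).
 */
struct model_wakeup_state model_unsupported_defaults(void)
{
	struct model_wakeup_state s = { false, true };

	return s;
}

/* Rx: ask for a wakeup only when the HW has no buffers left and the
 * fill ring has none to offer either. */
bool model_rx_need_wakeup(unsigned int hw_free_bufs,
			  unsigned int fill_ring_entries)
{
	return hw_free_bufs == 0 && fill_ring_entries == 0;
}

/* Tx: an outstanding completion interrupt restarts processing on its
 * own, so a wakeup is only needed when none are pending. */
bool model_tx_need_wakeup(unsigned int outstanding_tx_completions)
{
	return outstanding_tx_completions == 0;
}
```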
    • M
      xsk: replace ndo_xsk_async_xmit with ndo_xsk_wakeup · 9116e5e2
      Magnus Karlsson committed
      This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new
      ndo provides the same functionality as before but with the addition of
      a new flags field that is used to specify if Rx, Tx or both should be
      woken up. The previous ndo only woke up Tx, as implied by the
      name. The i40e and ixgbe drivers (which are all the supported ones)
      are updated with this new interface.
      
      This new ndo will be used by the new need_wakeup functionality of XDP
      sockets that need to be able to wake up both Rx and Tx driver
      processing.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
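A flags field that selects Rx, Tx, or both could be dispatched along these lines. This is a user-space sketch: the `MODEL_WAKEUP_*` bit values, `struct model_queue`, and the choice to reject empty or unknown flags are assumptions made for illustration, not the kernel's definitions.

```c
#include <stdint.h>

#define MODEL_WAKEUP_RX (1u << 0) /* hypothetical bit: wake up Rx processing */
#define MODEL_WAKEUP_TX (1u << 1) /* hypothetical bit: wake up Tx processing */

/* Counts simulated wakeups per direction. */
struct model_queue {
	unsigned int rx_wakeups;
	unsigned int tx_wakeups;
};

/*
 * Wake up the directions selected in flags. Returns 0 on success and -1
 * when flags is empty or contains bits this model does not know.
 */
int model_xsk_wakeup(struct model_queue *q, uint32_t flags)
{
	if (flags == 0 || (flags & ~(MODEL_WAKEUP_RX | MODEL_WAKEUP_TX)))
		return -1;

	if (flags & MODEL_WAKEUP_RX)
		q->rx_wakeups++;
	if (flags & MODEL_WAKEUP_TX)
		q->tx_wakeups++;
	return 0;
}
```

Passing only `MODEL_WAKEUP_TX` reproduces what the older Tx-only callback offered.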
  2. 16 Aug, 2019 14 commits
  3. 15 Aug, 2019 1 commit
  4. 14 Aug, 2019 13 commits
    • J
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · c162610c
      Jakub Kicinski committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter/IPVS updates for net-next:
      
      1) Rename mss field to mss_option field in synproxy, from Fernando Mancera.
      
      2) Use SYSCTL_{ZERO,ONE} definitions in conntrack, from Matteo Croce.
      
      3) More strict validation of IPVS sysctl values, from Junwei Hu.
      
      4) Remove unnecessary spaces on the right-hand side of assignments,
         from yangxingwu.
      
      5) Add offload support for bitwise operation.
      
      6) Extend the nft_offload_reg structure to store immediate data.
      
      7) Collapse several ip_set header files into ip_set.h, from
         Jeremy Sowden.
      
      8) Make netfilter headers compile with CONFIG_KERNEL_HEADER_TEST=y,
         from Jeremy Sowden.
      
      9) Fix several sparse warnings due to missing prototypes, from
         Valdis Kletnieks.
      
      10) Use static lock initialiser to ensure connlabel spinlock is
          initialized at boot time to fix sched/act_ct.c, patch
          from Florian Westphal.
      ====================
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • J
      Merge branch 'r8152-RX-improve' · b753c5a7
      Jakub Kicinski committed
      Hayes says:
      
      ====================
      v2:
      For patch #2, replace list_for_each_safe with list_for_each_entry_safe.
      Remove unlikely in WARN_ON. Adjust the coding style.
      
      For patch #4, replace list_for_each_safe with list_for_each_entry_safe.
      Remove "else" after "continue".
      
      For patch #5, replace sysfs with ethtool to modify rx_copybreak and
      rx_pending.
      
      v1:
      Different chips use different rx buffer sizes.
      
      Use skb_add_rx_frag() to reduce memory copy for RX.
      ====================
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • H
      r8152: change rx_copybreak and rx_pending through ethtool · e4a5017a
      Hayes Wang committed
      Allow rx_copybreak and rx_pending to be modified through
      ethtool.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • H
      r8152: support skb_add_rx_frag · 47922fcd
      Hayes Wang committed
      Use skb_add_rx_frag() to reduce the memory copy for rx data.
      
      Use a new list, rx_used, to store the rx buffers which cannot be
      reused yet.
      
      In addition, the total number of rx buffers may be increased or
      decreased dynamically. It is limited by RTL8152_MAX_RX_AGG.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • H
      r8152: use alloc_pages for rx buffer · d55d7089
      Hayes Wang committed
      Replace kmalloc_node() with alloc_pages() for rx buffer.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • H
      r8152: replace array with linking list for rx information · 252df8b8
      Hayes Wang committed
      The original method uses an array to store the rx information. The
      new one uses a list to link each rx structure. This makes it possible
      to increase or decrease the number of rx structures dynamically.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
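Why a list makes the number of rx structures adjustable at run time can be seen in a small user-space sketch. It is illustrative only: the driver uses the kernel's own list primitives, whereas `struct rx_node`, `rx_list_grow()`, and `rx_list_shrink()` below are hypothetical names built on a plain singly linked list and `malloc()`.

```c
#include <stdlib.h>

/* One rx structure with its buffer, linked to the next one. */
struct rx_node {
	struct rx_node *next;
	void *buffer;
};

/* The whole pool; count tracks how many rx structures exist right now. */
struct rx_list {
	struct rx_node *head;
	unsigned int count;
};

/* Add one rx structure with a buffer of buf_size bytes. 0 on success. */
int rx_list_grow(struct rx_list *list, size_t buf_size)
{
	struct rx_node *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->buffer = malloc(buf_size);
	if (!node->buffer) {
		free(node);
		return -1;
	}
	node->next = list->head;
	list->head = node;
	list->count++;
	return 0;
}

/* Remove and free one rx structure. Returns -1 if the list is empty. */
int rx_list_shrink(struct rx_list *list)
{
	struct rx_node *node = list->head;

	if (!node)
		return -1;
	list->head = node->next;
	list->count--;
	free(node->buffer);
	free(node);
	return 0;
}
```

With a fixed-size array, neither operation is possible without reallocating the whole table.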
    • H
      r8152: separate the rx buffer size · ec5791c2
      Hayes Wang committed
      Different chips may accept different rx buffer sizes. The RTL8152
      supports 16K bytes, and the RTL8153 supports 32K bytes.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • J
      Merge branch 'net-phy-let-phy_speed_down-up-support-speeds-1Gbps' · e070ca37
      Jakub Kicinski committed
      Heiner says:
      
      ====================
      So far phy_speed_down/up can be used up to 1Gbps only. Remove this
      restriction and add needed helpers to phy-core.c
      
      v2:
      - remove unused parameter in patch 1
      - rename __phy_speed_down to phy_speed_down_core in patch 2
      ====================
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • H
      net: phy: let phy_speed_down/up support speeds >1Gbps · 65b27995
      Heiner Kallweit committed
      So far phy_speed_down/up can be used up to 1Gbps only. Remove this
      restriction by using the new helper phy_speed_down_core. The new member
      adv_old in struct phy_device is used by phy_speed_up to restore the
      modes that were advertised before phy_speed_down was called. Don't
      simply advertise what is supported because a user may have
      intentionally removed modes from advertisement.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • H
      net: phy: add phy_speed_down_core and phy_resolve_min_speed · 331c56ac
      Heiner Kallweit committed
      phy_speed_down_core provides most of the functionality for
      phy_speed_down. It makes use of new helper phy_resolve_min_speed that is
      based on the sorting of the settings[] array. In certain cases it may be
      helpful to be able to exclude legacy half duplex modes, therefore
      prepare phy_resolve_min_speed() for it.
      
      v2:
      - rename __phy_speed_down to phy_speed_down_core
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
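Resolving the minimum common speed from a sorted settings table, with optional exclusion of half-duplex modes, might look roughly like this. It is a sketch under stated assumptions: the table contents, the bit numbering, the 32-bit masks, and every `model_` name are hypothetical, and the real helper works on the kernel's link mode bitmaps rather than plain integers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SPEED_UNKNOWN (-1)

/* One link mode: speed in Mbps, duplex, and its bit in the masks. */
struct model_setting {
	int speed;
	bool full_duplex;
	unsigned int bit;
};

/* Hypothetical table, sorted from fastest to slowest. */
static const struct model_setting model_settings[] = {
	{ 10000, true,  5 },
	{ 1000,  true,  4 },
	{ 100,   true,  3 },
	{ 100,   false, 2 },
	{ 10,    true,  1 },
	{ 10,    false, 0 },
};

/*
 * Return the lowest speed advertised by both link partners. Because the
 * table is sorted by speed, walking it from the end yields the minimum
 * at the first match. With fdx_only set, half-duplex modes are skipped.
 */
int model_resolve_min_speed(uint32_t local_adv, uint32_t partner_adv,
			    bool fdx_only)
{
	uint32_t common = local_adv & partner_adv;
	size_t i = sizeof(model_settings) / sizeof(model_settings[0]);

	while (i-- > 0) {
		const struct model_setting *s = &model_settings[i];

		if (!(common & (1u << s->bit)))
			continue;
		if (fdx_only && !s->full_duplex)
			continue;
		return s->speed;
	}
	return MODEL_SPEED_UNKNOWN;
}
```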
    • H
      net: phy: add __set_linkmode_max_speed · 7b261e0e
      Heiner Kallweit committed
      We will need the functionality of __set_linkmode_max_speed also for
      linkmode bitmaps other than phydev->supported. Therefore split it.
      
      v2:
      - remove unused parameter from __set_linkmode_max_speed
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • V
      net: devlink: remove redundant rtnl lock assert · 043b8413
      Vlad Buslov committed
      It is enough for the caller of devlink_compat_switch_id_get() to hold the
      net device to guarantee that the devlink port is not destroyed
      concurrently. Remove the rtnl lock assertion and modify the comment to
      warn users that they must hold either the rtnl lock or a reference to the
      net device. This is necessary to accommodate the future implementation of
      rtnl-unlocked TC offload driver callbacks.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    • J
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 708852dc
      Jakub Kicinski committed
      Daniel Borkmann says:
      
      ====================
      The following pull-request contains BPF updates for your *net-next* tree.
      
      There is a small merge conflict in libbpf (Cc Andrii so he's in the loop
      as well):
      
              for (i = 1; i <= btf__get_nr_types(btf); i++) {
                      t = (struct btf_type *)btf__type_by_id(btf, i);
      
                      if (!has_datasec && btf_is_var(t)) {
                              /* replace VAR with INT */
                              t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0);
        <<<<<<< HEAD
                              /*
                               * using size = 1 is the safest choice, 4 will be too
                               * big and cause kernel BTF validation failure if
                               * original variable took less than 4 bytes
                               */
                              t->size = 1;
                              *(int *)(t+1) = BTF_INT_ENC(0, 0, 8);
                      } else if (!has_datasec && kind == BTF_KIND_DATASEC) {
        =======
                              t->size = sizeof(int);
                              *(int *)(t + 1) = BTF_INT_ENC(0, 0, 32);
                      } else if (!has_datasec && btf_is_datasec(t)) {
        >>>>>>> 72ef80b5
                              /* replace DATASEC with STRUCT */
      
      The conflict is between the two commits 1d4126c4 ("libbpf: sanitize VAR to
      conservative 1-byte INT") and b03bc685 ("libbpf: convert libbpf code to
      use new btf helpers"), so we need to pick the sanitization fixup as well as
      use the new btf_is_datasec() helper and the whitespace cleanup. Looks like
      the following:
      
        [...]
                      if (!has_datasec && btf_is_var(t)) {
                              /* replace VAR with INT */
                              t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0);
                              /*
                               * using size = 1 is the safest choice, 4 will be too
                               * big and cause kernel BTF validation failure if
                               * original variable took less than 4 bytes
                               */
                              t->size = 1;
                              *(int *)(t + 1) = BTF_INT_ENC(0, 0, 8);
                      } else if (!has_datasec && btf_is_datasec(t)) {
                              /* replace DATASEC with STRUCT */
        [...]
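The resolution keeps a 1-byte INT described by 8 bits, so the declared size and the bit width agree. The sketch below models that relationship. It is an assumption-laden illustration: the `SKETCH_` macros are local re-creations of the BTF encodings as I understand their bit layout (kind in bits 24 and up of `info`, bit width in the low byte of the INT word) and should be checked against the kernel's BTF headers before being relied on.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BTF_KIND_INT 1u

/* Assumed layout: kind_flag in bit 31, kind from bit 24, vlen in the low 16 bits. */
#define SKETCH_BTF_INFO_ENC(kind, kind_flag, vlen)              \
	(((uint32_t)!!(kind_flag) << 31) |                      \
	 ((uint32_t)(kind) << 24) |                             \
	 ((uint32_t)(vlen) & 0xffffu))

/* Assumed layout: encoding from bit 24, bit offset from bit 16, width in the low byte. */
#define SKETCH_BTF_INT_ENC(encoding, bits_offset, nr_bits)      \
	(((uint32_t)(encoding) << 24) |                         \
	 ((uint32_t)(bits_offset) << 16) |                      \
	 (uint32_t)(nr_bits))

/* Extract the kind from an info word (4-bit kind field assumed). */
uint32_t sketch_info_kind(uint32_t info)
{
	return (info >> 24) & 0x0fu;
}

/* Extract the bit width from an INT encoding word. */
uint32_t sketch_int_bits(uint32_t int_enc)
{
	return int_enc & 0xffu;
}

/* True when the INT's bit width fits in its declared size in bytes. */
bool sketch_int_fits(uint32_t size_bytes, uint32_t int_enc)
{
	return sketch_int_bits(int_enc) <= size_bytes * 8u;
}
```

Under these assumptions, `size = 1` pairs with `SKETCH_BTF_INT_ENC(0, 0, 8)`, while the discarded side of the conflict paired `sizeof(int)` with a 32-bit width.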
      
      The main changes are:
      
      1) Addition of core parts of compile once - run everywhere (co-re) effort,
         that is, relocation of field offsets in libbpf as well as exposure of
         kernel's own BTF via sysfs and loading through libbpf, from Andrii.
      
         More info on co-re: http://vger.kernel.org/bpfconf2019.html#session-2
         and http://vger.kernel.org/lpc-bpf2018.html#session-2
      
      2) Enable passing input flags to the BPF flow dissector to customize parsing
         and allowing it to stop early similar to the C based one, from Stanislav.
      
      3) Add a BPF helper function that allows generating SYN cookies from XDP and
         tc BPF, from Petar.
      
      4) Add devmap hash-based map type for more flexibility in device lookup for
         redirects, from Toke.
      
      5) Improvements to XDP forwarding sample code now utilizing recently enabled
         devmap lookups, from Jesper.
      
      6) Add support for reporting the effective cgroup progs in bpftool, from Jakub
         and Takshak.
      
      7) Fix reading kernel config from bpftool via /proc/config.gz, from Peter.
      
      8) Fix AF_XDP umem pages mapping for 32 bit architectures, from Ivan.
      
      9) Follow-up to add two more BPF loop tests for the selftest suite, from Alexei.
      
      10) Add perf event output helper also for other skb-based program types, from Allan.
      
      11) Fix a co-re related compilation error in selftests, from Yonghong.
      ====================
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>