1. 24 Nov 2022, 1 commit
  2. 05 Nov 2022, 1 commit
  3. 31 Oct 2022, 1 commit
  4. 28 Oct 2022, 1 commit
    • net: enetc: survive memory pressure without crashing · 84ce1ca3
      Authored by Vladimir Oltean
      Under memory pressure, enetc_refill_rx_ring() may fail, and when called
      during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not
      checked for.
      
      An extreme case of memory pressure will result in exactly zero buffers
      being allocated for the RX ring, and in such a case it is expected that
      hardware drops all RX packets due to lack of buffers.
      
      This does not happen, because the reset-default value of both the
      consumer and the producer index is 0, which makes the ENETC believe
      that all buffers have been initialized and that it owns them (when in
      reality none were).
      
      The hardware guide explains this best:
      
      | Configure the receive ring producer index register RBaPIR with a value
      | of 0. The producer index is initially configured by software but owned
      | by hardware after the ring has been enabled. Hardware increments the
      | index when a frame is received which may consume one or more BDs.
      | Hardware is not allowed to increment the producer index to match the
      | consumer index since it is used to indicate an empty condition. The ring
      | can hold at most RBLENR[LENGTH]-1 received BDs.
      |
      | Configure the receive ring consumer index register RBaCIR. The
      | consumer index is owned by software and updated during operation of
      | the BD ring by software, to indicate that any receive data occupied
      | in the BD has been processed and it has been prepared for new data.
      | - If consumer index and producer index are initialized to the same
      |   value, it indicates that all BDs in the ring have been prepared and
      |   hardware owns all of the entries.
      | - If consumer index is initialized to producer index plus N, it would
      |   indicate N BDs have been prepared. Note that hardware cannot start if
      |   only a single buffer is prepared due to the restrictions described in
      |   (2).
      | - Software may write consumer index to match producer index anytime
      |   while the ring is operational to indicate all received BDs prior have
      |   been processed and new BDs prepared for hardware.
      
      Normally, the value of rx_ring->rcir (consumer index) is brought in sync
      with the rx_ring->next_to_use software index, but this only happens if
      page allocation ever succeeded.
      
      When PI==CI==0, the hardware appears to receive frames and write them to
      DMA address 0x0 (?!), then set the READY bit in the BD.
      
      The enetc_clean_rx_ring() function (and its XDP derivative) is naturally
      not prepared to handle such a condition. It will attempt to process
      those frames using the rx_swbd structure associated with index i of the
      RX ring, but that structure is not fully initialized (enetc_new_page()
      does all of that). So what happens next is undefined behavior.
      
      To operate with no buffers at all, we must initialize the CI to PI + 1,
      which blocks the hardware from advancing the PI any further, and makes
      it drop everything.
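
      A minimal sketch of that idea, assuming it runs from enetc_setup_rxbdr()
      right after the refill attempt; the register macros and the
      enetc_rxbdr_wr() helper follow the driver's existing naming but are
      assumptions here, not the verbatim fix:

      /* Sketch: program the RX BD ring indices so that, if zero buffers were
       * allocated, the hardware must drop all RX traffic instead of using
       * uninitialized buffer descriptors.
       */
      static void enetc_rxbdr_init_indices(struct enetc_hw *hw, int idx)
      {
              /* PI = 0: software has prepared nothing for hardware yet */
              enetc_rxbdr_wr(hw, idx, ENETC_RBPIR, 0);

              /* CI = PI + 1: per the hardware guide, hardware may never
               * advance the PI to match the CI, so with no prepared BDs it
               * can only drop incoming frames.
               */
              enetc_rxbdr_wr(hw, idx, ENETC_RBCIR, 1);
      }

      Once enetc_refill_rx_ring() does succeed, the normal datapath brings the
      CI back in sync with rx_ring->next_to_use, as described above.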
      
      The issue was seen while adding support for zero-copy AF_XDP sockets,
      where buffer memory comes from user space, which can even decide to
      supply no buffers at all (example: "xdpsock --txonly"). However, the bug
      is also present with the regular network stack code, even though it would take a
      very determined person to trigger a page allocation failure at the
      perfect time (a series of ifup/ifdown under memory pressure should
      eventually reproduce it given enough retries).
      
      Fixes: d4fd0404 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 27 Oct 2022, 1 commit
  6. 24 Oct 2022, 1 commit
  7. 07 Oct 2022, 1 commit
  8. 03 Oct 2022, 1 commit
    • net: fec: using page pool to manage RX buffers · 95698ff6
      Authored by Shenwei Wang
      This patch optimizes RX buffer management by using the page pool API.
      The purpose of this change is to prepare for the upcoming XDP support.
      The driver uses one frame per page for easy management.
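
      A minimal sketch of the one-frame-per-page scheme, assuming one
      page_pool per RX queue; the fec_rxq_* helpers and the page_pool field on
      struct fec_enet_priv_rx_q are illustrative assumptions, not the exact
      driver code:

      #include <net/page_pool.h>

      /* Sketch: create a per-RX-queue page pool; DMA mapping is delegated to
       * the pool via PP_FLAG_DMA_MAP.
       */
      static int fec_rxq_create_page_pool(struct fec_enet_priv_rx_q *rxq,
                                          struct device *dev, int pool_size)
      {
              struct page_pool_params pp = {
                      .flags          = PP_FLAG_DMA_MAP,
                      .order          = 0,            /* one frame per page */
                      .pool_size      = pool_size,
                      .nid            = NUMA_NO_NODE,
                      .dev            = dev,
                      .dma_dir        = DMA_FROM_DEVICE,
              };

              rxq->page_pool = page_pool_create(&pp);
              return PTR_ERR_OR_ZERO(rxq->page_pool);
      }

      /* Sketch: obtain one RX page from the pool for a buffer descriptor */
      static struct page *fec_rxq_alloc_rx_page(struct fec_enet_priv_rx_q *rxq,
                                                dma_addr_t *dma)
      {
              struct page *page = page_pool_dev_alloc_pages(rxq->page_pool);

              if (!page)
                      return NULL;

              /* valid because the pool was created with PP_FLAG_DMA_MAP */
              *dma = page_pool_get_dma_addr(page);
              return page;
      }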
      
      Added the __maybe_unused attribute to the following functions to avoid
      compiler warnings. Those functions will be removed by a separate
      patch once this page pool solution is accepted.
       - fec_enet_new_rxbdp
       - fec_enet_copybreak
      
      The following is a comparison between the page pool implementation
      and the original (non page pool) implementation.

       --- Small-packet (64 bytes) results are almost identical for both
       --- implementations on both the i.MX8 and i.MX6SX platforms.
      
      shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1 -l 64
      ------------------------------------------------------------
      Client connecting to 10.81.16.245, TCP port 5001
      TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
      ------------------------------------------------------------
      [  1] local 10.81.17.20 port 39728 connected with 10.81.16.245 port 5001
      [ ID] Interval       Transfer     Bandwidth
      [  1] 0.0000-1.0000 sec  37.0 MBytes   311 Mbits/sec
      [  1] 1.0000-2.0000 sec  36.6 MBytes   307 Mbits/sec
      [  1] 2.0000-3.0000 sec  37.2 MBytes   312 Mbits/sec
      [  1] 3.0000-4.0000 sec  37.1 MBytes   312 Mbits/sec
      [  1] 4.0000-5.0000 sec  37.2 MBytes   312 Mbits/sec
      [  1] 5.0000-6.0000 sec  37.2 MBytes   312 Mbits/sec
      [  1] 6.0000-7.0000 sec  37.2 MBytes   312 Mbits/sec
      [  1] 7.0000-8.0000 sec  37.2 MBytes   312 Mbits/sec
      [  1] 0.0000-8.0943 sec   299 MBytes   310 Mbits/sec
      
       --- Page Pool implementation on i.MX8 ----
      
      shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1
      ------------------------------------------------------------
      Client connecting to 10.81.16.245, TCP port 5001
      TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
      ------------------------------------------------------------
      [  1] local 10.81.17.20 port 43204 connected with 10.81.16.245 port 5001
      [ ID] Interval       Transfer     Bandwidth
      [  1] 0.0000-1.0000 sec   111 MBytes   933 Mbits/sec
      [  1] 1.0000-2.0000 sec   111 MBytes   934 Mbits/sec
      [  1] 2.0000-3.0000 sec   112 MBytes   935 Mbits/sec
      [  1] 3.0000-4.0000 sec   111 MBytes   933 Mbits/sec
      [  1] 4.0000-5.0000 sec   111 MBytes   934 Mbits/sec
      [  1] 5.0000-6.0000 sec   111 MBytes   933 Mbits/sec
      [  1] 6.0000-7.0000 sec   111 MBytes   931 Mbits/sec
      [  1] 7.0000-8.0000 sec   112 MBytes   935 Mbits/sec
      [  1] 8.0000-9.0000 sec   111 MBytes   933 Mbits/sec
      [  1] 9.0000-10.0000 sec   112 MBytes   935 Mbits/sec
      [  1] 0.0000-10.0077 sec  1.09 GBytes   933 Mbits/sec
      
       --- Non Page Pool implementation on i.MX8 ----
      
      shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1
      ------------------------------------------------------------
      Client connecting to 10.81.16.245, TCP port 5001
      TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
      ------------------------------------------------------------
      [  1] local 10.81.17.20 port 49154 connected with 10.81.16.245 port 5001
      [ ID] Interval       Transfer     Bandwidth
      [  1] 0.0000-1.0000 sec   104 MBytes   868 Mbits/sec
      [  1] 1.0000-2.0000 sec   105 MBytes   878 Mbits/sec
      [  1] 2.0000-3.0000 sec   105 MBytes   881 Mbits/sec
      [  1] 3.0000-4.0000 sec   105 MBytes   879 Mbits/sec
      [  1] 4.0000-5.0000 sec   105 MBytes   878 Mbits/sec
      [  1] 5.0000-6.0000 sec   105 MBytes   878 Mbits/sec
      [  1] 6.0000-7.0000 sec   104 MBytes   875 Mbits/sec
      [  1] 7.0000-8.0000 sec   104 MBytes   875 Mbits/sec
      [  1] 8.0000-9.0000 sec   104 MBytes   873 Mbits/sec
      [  1] 9.0000-10.0000 sec   104 MBytes   875 Mbits/sec
      [  1] 0.0000-10.0073 sec  1.02 GBytes   875 Mbits/sec
      
       --- Page Pool implementation on i.MX6SX ----
      
      shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1
      ------------------------------------------------------------
      Client connecting to 10.81.16.245, TCP port 5001
      TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
      ------------------------------------------------------------
      [  1] local 10.81.17.20 port 57288 connected with 10.81.16.245 port 5001
      [ ID] Interval       Transfer     Bandwidth
      [  1] 0.0000-1.0000 sec  78.8 MBytes   661 Mbits/sec
      [  1] 1.0000-2.0000 sec  82.5 MBytes   692 Mbits/sec
      [  1] 2.0000-3.0000 sec  82.4 MBytes   691 Mbits/sec
      [  1] 3.0000-4.0000 sec  82.4 MBytes   691 Mbits/sec
      [  1] 4.0000-5.0000 sec  82.5 MBytes   692 Mbits/sec
      [  1] 5.0000-6.0000 sec  82.4 MBytes   691 Mbits/sec
      [  1] 6.0000-7.0000 sec  82.5 MBytes   692 Mbits/sec
      [  1] 7.0000-8.0000 sec  82.4 MBytes   691 Mbits/sec
      [  1] 8.0000-9.0000 sec  82.4 MBytes   691 Mbits/sec
      [  1] 9.0000-9.5506 sec  45.0 MBytes   686 Mbits/sec
      [  1] 0.0000-9.5506 sec   783 MBytes   688 Mbits/sec
      
       --- Non Page Pool implementation on i.MX6SX ----
      
      shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1
      ------------------------------------------------------------
      Client connecting to 10.81.16.245, TCP port 5001
      TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
      ------------------------------------------------------------
      [  1] local 10.81.17.20 port 36486 connected with 10.81.16.245 port 5001
      [ ID] Interval       Transfer     Bandwidth
      [  1] 0.0000-1.0000 sec  70.5 MBytes   591 Mbits/sec
      [  1] 1.0000-2.0000 sec  64.5 MBytes   541 Mbits/sec
      [  1] 2.0000-3.0000 sec  73.6 MBytes   618 Mbits/sec
      [  1] 3.0000-4.0000 sec  73.6 MBytes   618 Mbits/sec
      [  1] 4.0000-5.0000 sec  72.9 MBytes   611 Mbits/sec
      [  1] 5.0000-6.0000 sec  73.4 MBytes   616 Mbits/sec
      [  1] 6.0000-7.0000 sec  73.5 MBytes   617 Mbits/sec
      [  1] 7.0000-8.0000 sec  73.4 MBytes   616 Mbits/sec
      [  1] 8.0000-9.0000 sec  73.4 MBytes   616 Mbits/sec
      [  1] 9.0000-10.0000 sec  73.9 MBytes   620 Mbits/sec
      [  1] 0.0000-10.0174 sec   723 MBytes   605 Mbits/sec
      Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 30 Sep 2022, 3 commits
  10. 29 Sep 2022, 1 commit
  11. 21 Sep 2022, 2 commits
    • net: enetc: deny offload of tc-based TSN features on VF interfaces · 5641c751
      Authored by Vladimir Oltean
      TSN features on the ENETC (taprio, cbs, gate, police) are configured
      through a mix of command BD ring messages and port registers:
      enetc_port_rd(), enetc_port_wr().
      
      Port registers are a region of the ENETC memory map that is only
      accessible from the PCIe Physical Function; they are not accessible
      from the Virtual Functions.
      
      Moreover, attempting to access these registers crashes the kernel:
      
      $ echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/sriov_numvfs
      pci 0000:00:01.0: [1957:ef00] type 00 class 0x020001
      fsl_enetc_vf 0000:00:01.0: Adding to iommu group 15
      fsl_enetc_vf 0000:00:01.0: enabling device (0000 -> 0002)
      fsl_enetc_vf 0000:00:01.0 eno0vf0: renamed from eth0
      $ tc qdisc replace dev eno0vf0 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
      	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
      	sched-entry S 0x7f 900000 sched-entry S 0x80 100000 flags 0x2
      Unable to handle kernel paging request at virtual address ffff800009551a08
      Internal error: Oops: 96000007 [#1] PREEMPT SMP
      pc : enetc_setup_tc_taprio+0x170/0x47c
      lr : enetc_setup_tc_taprio+0x16c/0x47c
      Call trace:
       enetc_setup_tc_taprio+0x170/0x47c
       enetc_setup_tc+0x38/0x2dc
       taprio_change+0x43c/0x970
       taprio_init+0x188/0x1e0
       qdisc_create+0x114/0x470
       tc_modify_qdisc+0x1fc/0x6c0
       rtnetlink_rcv_msg+0x12c/0x390
      
      Split enetc_setup_tc() into separate functions for the PF and for the
      VF drivers. Also remove enetc_qos.o from being included into
      enetc-vf.ko, since it serves absolutely no purpose there.
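
      A minimal sketch of that split; the handler names follow the driver's
      pattern (enetc_setup_tc_taprio appears in the oops above), but the exact
      dispatch shown here is an assumption, not the verbatim patch:

      /* Sketch: the PF keeps the full tc offload dispatch (it may touch port
       * registers and links enetc_qos.o), while the VF only keeps what it can
       * actually service and rejects the TSN offloads.
       */
      static int enetc_pf_setup_tc(struct net_device *ndev,
                                   enum tc_setup_type type, void *type_data)
      {
              switch (type) {
              case TC_SETUP_QDISC_MQPRIO:
                      return enetc_setup_tc_mqprio(ndev, type_data);
              case TC_SETUP_QDISC_TAPRIO:
                      return enetc_setup_tc_taprio(ndev, type_data);
              case TC_SETUP_QDISC_CBS:
                      return enetc_setup_tc_cbs(ndev, type_data);
              case TC_SETUP_QDISC_ETF:
                      return enetc_setup_tc_etf(ndev, type_data);
              case TC_SETUP_BLOCK:
                      return enetc_setup_tc_psfp(ndev, type_data);
              default:
                      return -EOPNOTSUPP;
              }
      }

      static int enetc_vf_setup_tc(struct net_device *ndev,
                                   enum tc_setup_type type, void *type_data)
      {
              switch (type) {
              case TC_SETUP_QDISC_MQPRIO:
                      return enetc_setup_tc_mqprio(ndev, type_data);
              default:
                      return -EOPNOTSUPP;
              }
      }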
      
      Fixes: 34c6adf1 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220916133209.3351399-2-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: enetc: move enetc_set_psfp() out of the common enetc_set_features() · fed38e64
      Authored by Vladimir Oltean
      The VF netdev driver shouldn't respond to changes in the NETIF_F_HW_TC
      flag; only PFs should. Moreover, TSN-specific code should go to
      enetc_qos.c, which should not be included in the VF driver.
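
      A rough sketch of keeping that handling PF-only; the wrapper below is an
      assumption following the description above, with enetc_set_psfp() living
      in enetc_qos.c and the remaining feature handling staying in the shared
      enetc_set_features():

      /* Sketch: only the PF's .ndo_set_features path reacts to NETIF_F_HW_TC;
       * the shared enetc_set_features() no longer knows about PSFP at all.
       */
      static int enetc_pf_set_features(struct net_device *ndev,
                                       netdev_features_t features)
      {
              netdev_features_t changed = ndev->features ^ features;
              int err;

              if (changed & NETIF_F_HW_TC) {
                      err = enetc_set_psfp(ndev, !!(features & NETIF_F_HW_TC));
                      if (err)
                              return err;
              }

              enetc_set_features(ndev, features); /* common, PSFP-free part */

              return 0;
      }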
      
      Fixes: 79e49982 ("net: enetc: add hw tc hw offload features for PSPF capability")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220916133209.3351399-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  12. 20 Sep 2022, 6 commits
  13. 16 Sep 2022, 2 commits
  14. 05 Sep 2022, 12 commits
  15. 03 Sep 2022, 3 commits
  16. 01 Sep 2022, 1 commit
  17. 27 Aug 2022, 1 commit
  18. 24 Aug 2022, 1 commit