1. 01 Nov, 2021 6 commits
  2. 29 Oct, 2021 19 commits
    • cls_flower: Fix inability to match GRE/IPIP packets · 6de6e46d
      Yoshiki Komachi committed
      When a packet of a new flow arrives in the openvswitch kernel module, the
      module dissects the packet and passes the extracted flow key to the
      ovs-vswitchd daemon. If the hw-offload configuration is enabled, the daemon
      creates a new TC flower entry to bypass the openvswitch kernel module for
      the flow (TC flower can also offload flows to NICs, but that does not
      matter here).
      
      In this processing flow, I found the following issue in cases of GRE/IPIP
      packets.
      
      When ovs_flow_key_extract() in openvswitch module parses a packet of a new
      GRE (or IPIP) flow received on non-tunneling vports, it extracts information
      of the outer IP header for ip_proto/src_ip/dst_ip match keys.
      
      This means ovs-vswitchd creates a TC flower entry with IP protocol/addresses
      match keys whose values are those of the outer IP header. OTOH, TC flower,
      which uses flow_dissector (different parser from openvswitch module), extracts
      information of the inner IP header.
      
      The following flow is an example to describe the issue in more detail.
      
         <----------- Outer IP -----------------> <---------- Inner IP ---------->
        +----------+--------------+--------------+----------+----------+----------+
        | ip_proto | src_ip       | dst_ip       | ip_proto | src_ip   | dst_ip   |
        | 47 (GRE) | 192.168.10.1 | 192.168.10.2 | 6 (TCP)  | 10.0.0.1 | 10.0.0.2 |
        +----------+--------------+--------------+----------+----------+----------+
      
      In this case, TC flower entry and extracted information are shown as below:
      
        - ovs-vswitchd creates TC flower entry with:
            - ip_proto: 47
            - src_ip: 192.168.10.1
            - dst_ip: 192.168.10.2
      
        - TC flower extracts below for IP header matches:
            - ip_proto: 6
            - src_ip: 10.0.0.1
            - dst_ip: 10.0.0.2
      
      Thus, GRE or IPIP packets never match the TC flower entry, as each
      dissector behaves differently.
      
      IMHO, the behavior of TC flower (flow dissector) does not look correct,
      as ip_proto/src_ip/dst_ip in TC flower match means the outermost IP
      header information except for GRE/IPIP cases. This patch adds a new
      flow_dissector flag FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP which skips
      dissection of the encapsulated inner GRE/IPIP header in TC flower
      classifier.
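The mismatch described above can be modeled in a few lines. This is a minimal Python sketch, not kernel code: `IPHeader`, `dissect`, and the `stop_before_encap` parameter are hypothetical names chosen for illustration, with the parameter standing in for the FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP flag. It assumes a packet is simply a list of IP headers, outermost first.

```python
from dataclasses import dataclass

# Hypothetical model, not kernel code: IP protocol numbers that carry
# another IP header as payload (47 = GRE, 4 = IPIP).
ENCAP_PROTOS = {47, 4}


@dataclass(frozen=True)
class IPHeader:
    ip_proto: int
    src_ip: str
    dst_ip: str


def dissect(headers, stop_before_encap=False):
    """Return the header whose fields end up in the match key.

    `headers` lists the IP headers outermost first. Without the flag the
    model keeps descending into encapsulated headers; with the flag it
    stops at the outermost one.
    """
    key = headers[0]
    for inner in headers[1:]:
        if stop_before_encap or key.ip_proto not in ENCAP_PROTOS:
            break
        key = inner
    return key


# The example packet from the commit message.
packet = [
    IPHeader(47, "192.168.10.1", "192.168.10.2"),  # outer: GRE
    IPHeader(6, "10.0.0.1", "10.0.0.2"),           # inner: TCP
]

# ovs-vswitchd installs an entry built from the outer header.
flower_entry = packet[0]

old_key = dissect(packet)                          # descends to the inner header
new_key = dissect(packet, stop_before_encap=True)  # stays on the outer header
print(old_key == flower_entry, new_key == flower_entry)
```

In this model the entry only matches when dissection stops before the encapsulated header, which is the behaviour the patch selects for the TC flower classifier.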
      Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: net: bridge: update IGMP/MLD membership interval value · 34d7ecb3
      Nikolay Aleksandrov committed
      When I fixed IGMPv3/MLDv2 to use the bridge's multicast_membership_interval
      value which is chosen by user-space instead of calculating it based on
      multicast_query_interval and multicast_query_response_interval I forgot
      to update the selftests relying on that behaviour. Now we have to
      manually set the expected GMI value to perform the tests correctly and get
      proper results (similar to IGMPv2 behaviour).
      
      Fixes: fac3cb82 ("net: bridge: mcast: use multicast_membership_interval for IGMPv3")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fix uninitialized variables when BRIDGE_CFM is disabled · 829e050e
      Ivan Vecera committed
      Function br_get_link_af_size_filtered() calls br_cfm_{,peer}_mep_count(),
      which return a count through a parameter. When BRIDGE_CFM is not enabled,
      these functions simply return -EOPNOTSUPP without modifying the count
      parameter, so the calling function works with uninitialized variables.
      Modify these inline functions to store zero in the count parameter.
      
      Fixes: b6d0425b ("bridge: cfm: Netlink Notifications.")
      Cc: Henrik Bjoernlund <henrik.bjoernlund@microchip.com>
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phylink: avoid mvneta warning when setting pause parameters · fd8d9731
      Russell King (Oracle) committed
      mvneta does not support asymmetric pause modes, and it flags this by the
      lack of AsymPause in the supported field. When setting pause modes, we
      check that pause->rx_pause == pause->tx_pause, but only when pause
      autoneg is enabled. When pause autoneg is disabled, we still allow
      pause->rx_pause != pause->tx_pause, which is incorrect when the MAC
      does not support asymmetric pause, and causes mvneta to issue a warning.
      
      Fix this by removing the test for pause->autoneg, so we always check
      that pause->rx_pause == pause->tx_pause for network devices that do not
      support AsymPause.
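The check after the fix, as this commit message describes it, can be sketched as follows. This is an illustrative Python model rather than the phylink code: `pause_request_ok` and its parameters are hypothetical names, and the return value simply signals whether the request would be accepted.

```python
def pause_request_ok(supports_asym_pause, autoneg, rx_pause, tx_pause):
    """Model of the pause-parameter check after the fix.

    `autoneg` is accepted only to show that it no longer influences the
    result: a device without AsymPause must always get rx_pause == tx_pause.
    """
    if not supports_asym_pause and rx_pause != tx_pause:
        return False  # asymmetric request on a symmetric-only device
    return True


# A symmetric-only device (such as mvneta in the commit message) rejects
# asymmetric settings whether or not pause autoneg is enabled.
for autoneg in (True, False):
    print(autoneg, pause_request_ok(False, autoneg, rx_pause=True, tx_pause=False))
```

A device that advertises AsymPause accepts any combination in this model, since the comparison is only applied when the capability is missing.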
      
      Fixes: 9525ae83 ("phylink: add phylink infrastructure")
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'nfp-fixes' · 0f48fb66
      David S. Miller committed
      Simon Horman says:
      
      ====================
      nfp: fix bugs caused by adaptive coalesce
      
      This series contains fixes for two bugs introduced when
      adaptive coalesce support was added to the NFP driver in
      v5.15 by 9d32e4e7 ("nfp: add support for coalesce adaptive feature")
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: fix potential deadlock when canceling dim work · 17e712c6
      Yinjun Zhang committed
      When port is linked down, the process which has acquired rtnl_lock
      will wait for the in-progress dim work to finish, and the work also
      acquires rtnl_lock, which may cause deadlock.
      
      Currently the IRQ_MOD registers can be configured by both `ethtool -C`
      and the dim work, and which setting takes effect depends on the
      execution order. rtnl_lock is therefore useless here, so remove it.
      
      Fixes: 9d32e4e7 ("nfp: add support for coalesce adaptive feature")
      Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
      Signed-off-by: Louis Peens <louis.peens@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: fix NULL pointer access when scheduling dim work · f8d384a6
      Yinjun Zhang committed
      Each rx/tx ring has a related dim work. When the number of rx/tx rings
      is decreased by `ethtool -L`, the corresponding rx_ring or tx_ring is
      set to NULL while its related work is not destroyed. When scheduled,
      the work will then access a NULL pointer.
      
      Fixes: 9d32e4e7 ("nfp: add support for coalesce adaptive feature")
      Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
      Signed-off-by: Louis Peens <louis.peens@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: update .gitignore with newly added tests · e300a85d
      Shuah Khan committed
      Update .gitignore with newly added tests:
      	tools/testing/selftests/net/af_unix/test_unix_oob
      	tools/testing/selftests/net/gro
      	tools/testing/selftests/net/ioam6_parser
      	tools/testing/selftests/net/toeplitz
      Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: amd-xgbe: Toggle PLL settings during rate change · daf182d3
      Shyam Sundar S K committed
      For each rate change command submission, the FW has to do a phy
      power off sequence internally. For this to happen correctly, the
      PLL re-initialization control setting has to be turned off before
      sending mailbox commands and re-enabled once the command submission
      is complete.
      
      Without the PLL control setting, the link up takes longer time in a
      fixed phy configuration.
      
      Fixes: 47f164de ("amd-xgbe: Add PCI device support")
      Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
      Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
      Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
      Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sctp-plpmtud-fixes' · cec6880d
      David S. Miller committed
      Xin Long says:
      
      ====================
      sctp: a couple of fixes for PLPMTUD
      
      Four fixes included in this patchset:
      
        - fix the packet sending in Error state.
        - fix the timer stop when transport update dst.
        - fix the outer header len calculation.
        - fix the return value for toobig processing.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: return true only for pathmtu update in sctp_transport_pl_toobig · 75cf662c
      Xin Long committed
      sctp_transport_pl_toobig() is supposed to return true only if there is
      a pathmtu update, so that sctp_icmp_frag_needed() then calls
      sctp_assoc_sync_pmtu() and sctp_retransmit(). This patch fixes the
      return statements in sctp_transport_pl_toobig() accordingly.
      
      Fixes: 83696408 ("sctp: do state transition when receiving an icmp TOOBIG packet")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: subtract sctphdr len in sctp_transport_pl_hlen · cc4665ca
      Xin Long committed
      sctp_transport_pl_hlen() is called to calculate the outer header length
      for PL. However, as shown in the figure in rfc8899#section-4.4:
      
         Any additional
           headers         .--- MPS -----.
                  |        |             |
                  v        v             v
           +------------------------------+
           | IP | ** | PL | protocol data |
           +------------------------------+
      
                      <----- PLPMTU ----->
           <---------- PMTU -------------->
      
      The outer headers are IP plus any additional headers; they do not
      include the Packetization Layer's own header, namely sctphdr, whereas
      sctphdr is counted by __sctp_mtu_payload().
      
      The incorrect calculation caused the link pathmtu to be set larger
      than expected by t->pl.pmtu + sctp_transport_pl_hlen(). This patch
      is to fix it by subtracting sctphdr len in sctp_transport_pl_hlen().
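The arithmetic behind this fix can be illustrated with a small Python sketch. It is a model under stated assumptions, not the kernel implementation: `mtu_overhead`, `pl_hlen_old`, and `pl_hlen_new` are hypothetical helpers, the example assumes a 20-byte IPv4 header without options and no additional headers, and it assumes that before the fix the full overhead (including the 12-byte SCTP common header) was used as the outer header length.

```python
SCTP_HDR_LEN = 12  # size of the SCTP common header (sctphdr)


def mtu_overhead(ip_hdr_len, extra_hdr_len=0):
    """Per-packet overhead that also counts the SCTP common header."""
    return ip_hdr_len + extra_hdr_len + SCTP_HDR_LEN


def pl_hlen_old(ip_hdr_len, extra_hdr_len=0):
    """Assumed pre-fix behaviour: the PL header is counted as outer header."""
    return mtu_overhead(ip_hdr_len, extra_hdr_len)


def pl_hlen_new(ip_hdr_len, extra_hdr_len=0):
    """Post-fix behaviour: IP plus any additional headers only."""
    return mtu_overhead(ip_hdr_len, extra_hdr_len) - SCTP_HDR_LEN


plpmtu = 1200      # PLPMTU, which already includes the SCTP header
ipv4_hdr_len = 20  # assumed IPv4 header without options

old_pathmtu = plpmtu + pl_hlen_old(ipv4_hdr_len)  # SCTP header counted twice
new_pathmtu = plpmtu + pl_hlen_new(ipv4_hdr_len)  # matches the RFC figure
print(old_pathmtu, new_pathmtu)
```

Under these assumptions the old calculation yields a link pathmtu that is too large by exactly the SCTP common header length.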
      
      Fixes: d9e2e410 ("sctp: add the constants/variables and states and some APIs for transport")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: reset probe_timer in sctp_transport_pl_update · c6ea04ea
      Xin Long committed
      sctp_transport_pl_update() is called when a transport updates its dst
      and pathmtu. Instead of stopping the PLPMTUD probe timer, PLPMTUD should
      start over and reset the probe timer. Otherwise, the PLPMTUD service
      would stop.
      
      Fixes: 92548ec2 ("sctp: add the probe timer in transport for PLPMTUD")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: allow IP fragmentation when PLPMTUD enters Error state · 40171248
      Xin Long committed
      Currently, when PLPMTUD enters Error state, the transport pathmtu is set
      to MIN_PLPMTU(512) while probing continues with BASE_PLPMTU(1200). This
      causes pathmtu to stay at a very small value, even if the real pmtu
      is some value like 1000.
      
      RFC8899 doesn't clearly say how to set the value in Error state. One
      possibility is to keep using BASE_PLPMTU for the pmtu, but to allow
      IP fragmentation while in Error state.
      
      As it says in rfc8899#section-5.4:
      
         Some paths could be unable to sustain packets of the BASE_PLPMTU
         size.  The Error State could be implemented to provide robustness to
         such paths.  This allows fallback to a smaller than desired PLPMTU
         rather than suffer connectivity failure.  This could utilize methods
         such as endpoint IP fragmentation to enable the PL sender to
         communicate using packets smaller than the BASE_PLPMTU.
      
      This patch is to set pmtu to BASE_PLPMTU instead of MIN_PLPMTU for Error
      state in sctp_transport_pl_send/toobig(), and set packet ipfragok for
      non-probe packets when it's in Error state.
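The change in Error state handling described above can be summarized in a short Python sketch. It is only a model of the commit message: `error_state_pmtu` and `error_state_ipfragok` are hypothetical helper names, and the two constants use the values quoted in the text.

```python
MIN_PLPMTU = 512    # value quoted in the commit message
BASE_PLPMTU = 1200  # value quoted in the commit message


def error_state_pmtu(patched):
    """pmtu used while PLPMTUD is in Error state, before and after the patch."""
    return BASE_PLPMTU if patched else MIN_PLPMTU


def error_state_ipfragok(is_probe):
    """After the patch, only non-probe packets may be IP-fragmented."""
    return not is_probe


print(error_state_pmtu(patched=False), error_state_pmtu(patched=True))
print(error_state_ipfragok(is_probe=True), error_state_ipfragok(is_probe=False))
```

With a real path MTU of around 1000, as in the example from the text, the patched behaviour keeps the larger pmtu and relies on IP fragmentation for data packets instead of dropping to 512.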
      
      Fixes: 1dc68c19 ("sctp: do state transition when PROBE_COUNT == MAX_PROBES on HB send path")
      Reported-by: Ying Xu <yinxu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'net-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 411a44c2
      Linus Torvalds committed
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from WiFi (mac80211), and BPF.
      
        Current release - regressions:
      
         - skb_expand_head: adjust skb->truesize to fix socket memory
           accounting
      
         - mptcp: fix corrupt receiver key in MPC + data + checksum
      
        Previous releases - regressions:
      
         - multicast: calculate csum of looped-back and forwarded packets
      
         - cgroup: fix memory leak caused by missing cgroup_bpf_offline
      
         - cfg80211: fix management registrations locking, prevent list
           corruption
      
         - cfg80211: correct false positive in bridge/4addr mode check
      
         - tcp_bpf: fix race in the tcp_bpf_send_verdict resulting in reusing
           previous verdict
      
        Previous releases - always broken:
      
         - sctp: enhancements for the verification tag, prevent attackers from
           killing SCTP sessions
      
         - tipc: fix size validations for the MSG_CRYPTO type
      
         - mac80211: mesh: fix HE operation element length check, prevent out
           of bound access
      
         - tls: fix sign of socket errors, prevent positive error codes being
           reported from read()/write()
      
         - cfg80211: scan: extend RCU protection in
           cfg80211_add_nontrans_list()
      
         - implement ->sock_is_readable() for UDP and AF_UNIX, fix poll() for
           sockets in a BPF sockmap
      
         - bpf: fix potential race in tail call compatibility check resulting
           in two operations which would make the map incompatible succeeding
      
         - bpf: prevent increasing bpf_jit_limit above max
      
         - bpf: fix error usage of map_fd and fdget() in generic batch update
      
         - phy: ethtool: lock the phy for consistency of results
      
         - prevent infinite while loop in skb_tx_hash() when Tx races with
           driver reconfiguring the queue <> traffic class mapping
      
         - usbnet: fixes for bad HW conjured by syzbot
      
         - xen: stop tx queues during live migration, prevent UAF
      
         - net-sysfs: initialize uid and gid before calling
           net_ns_get_ownership
      
         - mlxsw: prevent Rx stalls under memory pressure"
      
      * tag 'net-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
        Revert "net: hns3: fix pause config problem after autoneg disabled"
        mptcp: fix corrupt receiver key in MPC + data + checksum
        riscv, bpf: Fix potential NULL dereference
        octeontx2-af: Fix possible null pointer dereference.
        octeontx2-af: Display all enabled PF VF rsrc_alloc entries.
        octeontx2-af: Check whether ipolicers exists
        net: ethernet: microchip: lan743x: Fix skb allocation failure
        net/tls: Fix flipped sign in async_wait.err assignment
        net/tls: Fix flipped sign in tls_err_abort() calls
        net/smc: Correct spelling mistake to TCPF_SYN_RECV
        net/smc: Fix smc_link->llc_testlink_time overflow
        nfp: bpf: relax prog rejection for mtu check through max_pkt_offset
        vmxnet3: do not stop tx queues after netif_device_detach()
        r8169: Add device 10ec:8162 to driver r8169
        ptp: Document the PTP_CLK_MAGIC ioctl number
        usbnet: fix error return code in usbnet_probe()
        net: hns3: adjust string spaces of some parameters of tx bd info in debugfs
        net: hns3: expand buffer len for some debugfs command
        net: hns3: add more string spaces for dumping packets number of queue info in debugfs
        net: hns3: fix data endian problem of some functions of debugfs
        ...
    • Merge tag 'spi-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 4fb7d85b
      Linus Torvalds committed
      Pull spi fixes from Mark Brown:
       "A couple of final driver specific fixes for v5.15, one fixing
        potential ID collisions between two instances of the Altera driver and
        one making Microwire full duplex mode actually work on pl022"
      
      * tag 'spi-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: spl022: fix Microwire full duplex mode
        spi: altera: Change to dynamic allocation of spi id
    • Merge tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 8685de2e
      Linus Torvalds committed
      Pull regmap fix from Mark Brown:
       "This fixes a potential double free when handling an out of memory
        error inserting a node into an rbtree regcache"
      
      * tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: Fix possible double-free in regcache_rbtree_exit()
    • Merge tag 'linux-watchdog-5.15-rc7' of git://www.linux-watchdog.org/linux-watchdog · eecd231a
      Linus Torvalds committed
      Pull watchdog fixes from Wim Van Sebroeck:
       "I overlooked Guenter's request to send this upstream earlier, so it's a
        bit late in the release cycle.
      
        This contains:
      
         - Revert "watchdog: iTCO_wdt: Account for rebooting on second
           timeout"
      
         - sbsa: only use 32-bit accessors
      
         - sbsa: drop unneeded MODULE_ALIAS
      
         - ixp4xx_wdt: Fix address space warning
      
         - Fix OMAP watchdog early handling"
      
      * tag 'linux-watchdog-5.15-rc7' of git://www.linux-watchdog.org/linux-watchdog:
        watchdog: Fix OMAP watchdog early handling
        watchdog: ixp4xx_wdt: Fix address space warning
        watchdog: sbsa: drop unneeded MODULE_ALIAS
        watchdog: sbsa: only use 32-bit accessors
        Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"
    • Merge tag 'trace-v5.15-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · fc18cc89
      Linus Torvalds committed
      Pull tracing fix from Steven Rostedt:
       "Do not WARN when attaching event probe to non-existent event
      
        If the user tries to attach an event probe (eprobe) to an event that
        does not exist, it will trigger a warning. There's an error check that
        only expects memory issues otherwise it is considered a bug. But
        changes in the code to move around the locking made it that it can
        error out if the user attempts to attach to an event that does not
        exist, returning an -ENODEV. As this path can be caused by user space
        putting in a bad value, do not trigger a WARN"
      
      * tag 'trace-v5.15-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Do not warn when connecting eprobe to non existing event
  3. 28 Oct, 2021 15 commits