1. 05 Feb, 2022 14 commits
  2. 04 Feb, 2022 26 commits
    • D
      Merge branch 'ipa-RX-replenish' · c531adaf
      David S. Miller committed
      Alex Elder says:
      
      ====================
      net: ipa: improve RX buffer replenishing
      
      This series revises the algorithm used for replenishing receive
      buffers on RX endpoints.  Currently there are two atomic variables
      that track how many receive buffers can be sent to the hardware.
      The new algorithm obviates the need for those, by just assuming we
      always want to provide the hardware with buffers until it can hold
      no more.
      
      The first patch eliminates an atomic variable that's not required.
      The next moves some code into the main replenish function's caller,
      making one of the called function's arguments unnecessary.   The
      next six refactor things a bit more, adding a new helper function
      that allows us to eliminate an additional atomic variable.  And the
      final two implement two more minor improvements.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c531adaf
    • A
      net: ipa: determine replenish doorbell differently · 9654d8c4
      Alex Elder committed
      Rather than tracking the number of receive buffer transactions that
      have been submitted without a doorbell, just track the total number
      of transactions that have been issued.  Then ring the doorbell when
      that number modulo the replenish batch size is 0.
      
      The effect is roughly the same, but the new count is slightly more
      interesting, and this approach will someday allow the replenish
      batch size to be tuned at runtime.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9654d8c4
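The doorbell rule described in this commit can be sketched in a few lines of C. This is only an illustrative model, not the driver's code: the `rx_endpoint` structure, the `replenish_doorbell()` helper, and the batch size of 16 are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical batch size; the driver's real value is not stated above. */
#define REPLENISH_BATCH 16u

/* Hypothetical per-endpoint state: only the running total is needed. */
struct rx_endpoint {
	uint64_t replenish_count;	/* total transactions issued so far */
};

/*
 * Count one issued transaction and report whether the doorbell should
 * ring, i.e. whether the new total is a multiple of the batch size.
 */
static bool replenish_doorbell(struct rx_endpoint *ep)
{
	return ++ep->replenish_count % REPLENISH_BATCH == 0;
}
```

Because the decision depends only on the total and the batch size, the batch size could in principle become a variable rather than a constant, which is the runtime tuning the message alludes to.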
    • A
      net: ipa: replenish after delivering payload · 5d6ac24f
      Alex Elder committed
      Replenishing is now solely driven by whether transactions are
      available for a channel, and it doesn't really matter whether
      we replenish before or after we deliver received packets to the
      network stack.
      
      Replenishing before delivering the payload adds a little latency.
      Eliminate that by requesting a replenish after the payload is
      delivered.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d6ac24f
    • A
      net: ipa: kill replenish_backlog · 09b337de
      Alex Elder committed
      We no longer use the replenish_backlog atomic variable to decide
      when we've got work to do providing receive buffers to hardware.
      Basically, we try to keep the hardware as full as possible, all the
      time.  We keep supplying buffers until the hardware has no more
      space for them.
      
      As a result, we can get rid of the replenish_backlog field and the
      atomic operations performed on it.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      09b337de
    • A
      net: ipa: introduce gsi_channel_trans_idle() · 5fc7f9ba
      Alex Elder committed
      Create a new function that returns true if all transactions for a
      channel are available for use.
      
      Use it in ipa_endpoint_replenish_enable() to see whether to start
      replenishing, and in ipa_endpoint_replenish() to determine whether
      it's necessary after a failure to schedule delayed work to ensure a
      future replenish attempt occurs.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5fc7f9ba
    • A
      net: ipa: don't use replenish_backlog · d0ac30e7
      Alex Elder committed
      Rather than determining when to stop replenishing using the
      replenish backlog, just stop when we have exhausted all available
      transactions.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0ac30e7
    • A
      net: ipa: allocate transaction in replenish loop · 6a606b90
      Alex Elder committed
      When replenishing, have ipa_endpoint_replenish() allocate a
      transaction, and pass that to ipa_endpoint_replenish_one() to fill.
      Then, if that produces no error, commit the transaction within the
      replenish loop as well.  In this way we can distinguish between
      transaction failures and buffer allocation/mapping failures.
      
      Failure to allocate a transaction simply means the hardware already
      has as many receive buffers as it can hold.  In that case we can
      break out of the replenish loop because there's nothing more to do.
      
      If we fail to allocate or map pages for the receive buffer, just
      try again later.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6a606b90
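The loop structure this commit describes, with its two distinct failure modes, might look roughly like the following standalone model. Everything here is hypothetical: the `channel` structure, the helper names, and the `buf_budget` counter that simulates page allocation failures stand in for the real transaction and buffer machinery.

```c
#include <stdbool.h>

/* Hypothetical model of a channel with a fixed pool of transactions. */
struct channel {
	unsigned int trans_avail;	/* transactions not yet given to hardware */
	unsigned int buf_budget;	/* buffers that can still be allocated */
	unsigned int committed;		/* transactions committed so far */
};

enum replenish_result { REPLENISH_FULL, REPLENISH_RETRY_LATER };

/* Take one transaction from the pool; fails when the pool is empty. */
static bool trans_alloc(struct channel *ch)
{
	if (!ch->trans_avail)
		return false;
	ch->trans_avail--;
	return true;
}

/* Return an unused transaction to the pool. */
static void trans_free(struct channel *ch)
{
	ch->trans_avail++;
}

/* Simulate allocating and mapping a receive buffer. */
static bool buffer_fill(struct channel *ch)
{
	if (!ch->buf_budget)
		return false;
	ch->buf_budget--;
	return true;
}

static enum replenish_result replenish(struct channel *ch)
{
	for (;;) {
		/* No transaction left: hardware already holds all it can. */
		if (!trans_alloc(ch))
			return REPLENISH_FULL;

		/* Buffer failure: release the transaction, try again later. */
		if (!buffer_fill(ch)) {
			trans_free(ch);
			return REPLENISH_RETRY_LATER;
		}

		ch->committed++;	/* commit inside the loop */
	}
}
```

Allocating the transaction first also means no pages are allocated once transactions are exhausted, since the cheap check happens before the expensive one.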
    • A
      net: ipa: decide on doorbell in replenish loop · b9dbabc5
      Alex Elder committed
      Decide whether the doorbell should be signaled when committing a
      replenish transaction in the main replenish loop, rather than in
      ipa_endpoint_replenish_one().  This is a step to facilitate the
      next patch.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b9dbabc5
    • A
      net: ipa: increment backlog in replenish caller · 4b22d841
      Alex Elder committed
      Three spots call ipa_endpoint_replenish(), and just one of those
      requests that the backlog be incremented after completing the
      replenish operation.
      
      Instead, have the caller increment the backlog, and get rid of the
      add_one argument to ipa_endpoint_replenish().
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4b22d841
    • A
      net: ipa: allocate transaction before pages when replenishing · b4061c13
      Alex Elder committed
      A transaction failure only occurs if no more transactions are
      available for an endpoint.  It's a very cheap test.
      
      When replenishing an RX endpoint buffer, there's no point in
      allocating pages if transactions are exhausted.  So don't bother
      doing so unless the transaction allocation succeeds.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b4061c13
    • A
      net: ipa: kill replenish_saved · a9bec7ae
      Alex Elder committed
      The replenish_saved field keeps track of the number of times a new
      buffer is added to the backlog when replenishing is disabled.  We
      don't really use it though, so there's no need for us to track it
      separately.  Whether replenishing is enabled or not, we can simply
      increment the backlog.
      
      Get rid of replenish_saved, and initialize and increment the backlog
      where it would have otherwise been used.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a9bec7ae
    • J
      tls: cap the output scatter list to something reasonable · b93235e6
      Jakub Kicinski committed
      TLS recvmsg() passes user pages as destination for decrypt.
      The decrypt operation is repeated record by record, each
      record being 16kB, max. TLS allocates an sg_table and uses
      iov_iter_get_pages() to populate it with enough pages to
      fit the decrypted record.
      
      Even though we decrypt a single message at a time we size
      the sg_table based on the entire length of the iovec.
      This leads to unnecessarily large allocations, risking
      triggering OOM conditions.
      
      Use iov_iter_truncate() / iov_iter_reexpand() to construct
      a "capped" version of iov_iter_npages(). Alternatively we
      could parametrize iov_iter_npages() to take the size as
      arg instead of using i->count, or do something else.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b93235e6
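The truncate, count, re-expand idea can be illustrated with a deliberately simplified model. This is not the kernel's `iov_iter` API: `model_iter`, `model_npages()`, and `model_npages_cap()` are hypothetical stand-ins, the page size is assumed to be 4096 bytes, and the model ignores buffer alignment, which the real page count would depend on.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for the example */

/* Hypothetical stand-in for an iterator: just the remaining byte count. */
struct model_iter {
	size_t count;
};

/* Pages needed to cover the whole iterator (alignment ignored). */
static size_t model_npages(const struct model_iter *it)
{
	return (it->count + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* Pages needed to cover at most max_bytes of the iterator. */
static size_t model_npages_cap(struct model_iter *it, size_t max_bytes)
{
	size_t orig = it->count;
	size_t pages;

	if (it->count > max_bytes)
		it->count = max_bytes;	/* plays the role of iov_iter_truncate() */
	pages = model_npages(it);
	it->count = orig;		/* plays the role of iov_iter_reexpand() */

	return pages;
}
```

In this model a 1 MiB iterator capped at one 16 kB record needs 4 pages instead of 256, and the iterator's length is unchanged afterwards.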
    • R
      net: dsa: realtek: convert to phylink_generic_validate() · 6ff60646
      Russell King (Oracle) committed
      Populate the supported interfaces and MAC capabilities for the Realtek
      rtl8365 DSA switch and remove the old validate implementation to allow
      DSA to use phylink_generic_validate() for this switch driver.
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ff60646
    • D
      Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · eace555b
      David S. Miller committed
      Tony Nguyen says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2022-02-03
      
      This series contains updates to the i40e client header file and driver.
      
      Mateusz disables HW TC offload by default.
      
      Joe Damato removes a no longer used statistic.
      
      Jakub Kicinski removes an unused enum from the client header file.
      
      Jedrzej changes some admin queue commands to occur under atomic context
      and adds new functions for admin queue MAC VLAN filters to avoid a
      potential race that could occur due to storing results in a structure that
      could be overwritten by the next admin queue call.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eace555b
    • H
      net: lan966x: use .mac_select_pcs() interface · 41414c9b
      Horatiu Vultur committed
      Convert lan966x to use the .mac_select_pcs() interface instead of
      phylink_set_pcs().
      Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Link: https://lore.kernel.org/r/20220202114949.833075-1-horatiu.vultur@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      41414c9b
    • G
      selftests: rtnetlink: Use more sensible tos values · 95eb6ef8
      Guillaume Nault committed
      Using tos 0x1 with 'ip route get <IPv4 address> ...' doesn't test much
      of the tos option handling: 0x1 just sets an ECN bit, which is cleared
      by inet_rtm_getroute() before doing the fib lookup. Let's use 0x10
      instead, which is actually taken into account in the route lookup (and
      is less surprising for the reader).
      
      For consistency, use 0x10 for the IPv6 route lookup too (IPv6 currently
      doesn't clear ECN bits, but might do so in the future).
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Link: https://lore.kernel.org/r/d61119e68d01ba7ef3ba50c1345a5123a11de123.1643815297.git.gnault@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      95eb6ef8
    • G
      selftests: fib offload: use sensible tos values · bafe517a
      Guillaume Nault committed
      Although both iproute2 and the kernel accept 1 and 2 as tos values for
      new routes, those are invalid. These values only set ECN bits, which
      are ignored during IPv4 fib lookups. Therefore, no packet can actually
      match such routes. This selftest therefore only succeeds because it
      doesn't verify that the new routes do actually work in practice (it
      just checks if the routes are offloaded or not).
      
      It makes more sense to use tos values that don't conflict with ECN.
      This way, the selftest won't be affected if we later decide to warn or
      even reject invalid tos configurations for new routes.
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Link: https://lore.kernel.org/r/5e43b343720360a1c0e4f5947d9e917b26f30fbf.1643826556.git.gnault@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      bafe517a
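Why tos values 1 and 2 carry no routing information, while 0x10 does, follows from simple bit arithmetic. The sketch below clears the two low-order ECN bits of the TOS byte; the `ECN_BITS` constant and the `tos_without_ecn()` helper are names invented for this example and do not reproduce the kernel's own mask or lookup code.

```c
#include <stdint.h>

/* The two low-order bits of the IPv4 TOS byte carry ECN (RFC 3168). */
#define ECN_BITS 0x03u

/* Return the TOS value as seen once the ECN bits are cleared. */
static uint8_t tos_without_ecn(uint8_t tos)
{
	return (uint8_t)(tos & ~ECN_BITS);
}
```

With this helper, 0x1 and 0x2 both reduce to 0, so they are indistinguishable from "no tos", whereas 0x10 is left unchanged.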
    • E
      net: minor __dev_alloc_name() optimization · 25ee1660
      Eric Dumazet committed
      __dev_alloc_name() allocates a private zeroed page,
      then sets bits in it while iterating through net devices.
      
      It can use __set_bit() to avoid unnecessary locked operations.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220203064609.3242863-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      25ee1660
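The reasoning is that a bitmap visible to only one thread needs no atomic read-modify-write. A user-space sketch of such a plain (non-atomic) bitmap follows; `plain_set_bit()`, `plain_test_bit()`, and `first_zero_bit()` are hypothetical helpers written for this example, not the kernel's bitops, and the final search for a free index is an assumption about how such a bitmap would typically be consumed.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Non-atomic set: a plain OR, safe only while the bitmap is private. */
static void plain_set_bit(size_t nr, unsigned long *map)
{
	map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/* Return 1 if bit nr is set, 0 otherwise. */
static int plain_test_bit(size_t nr, const unsigned long *map)
{
	return (int)((map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL);
}

/* Index of the first clear bit, or nbits if every bit is set. */
static size_t first_zero_bit(const unsigned long *map, size_t nbits)
{
	for (size_t i = 0; i < nbits; i++)
		if (!plain_test_bit(i, map))
			return i;
	return nbits;
}
```

An atomic variant would need a locked instruction or a compare-and-swap loop for every bit set, which is wasted work when no other CPU can see the page.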
    • J
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · c59400a6
      Jakub Kicinski committed
      No conflicts.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c59400a6
    • K
      gcc-plugins/stackleak: Use noinstr in favor of notrace · dcb85f85
      Kees Cook committed
      While the stackleak plugin was already using notrace, objtool is now a
      bit more picky.  Update the notrace uses to noinstr.  Silences the
      following objtool warnings when building with:
      
      CONFIG_DEBUG_ENTRY=y
      CONFIG_STACK_VALIDATION=y
      CONFIG_VMLINUX_VALIDATION=y
      CONFIG_GCC_PLUGIN_STACKLEAK=y
      
        vmlinux.o: warning: objtool: do_syscall_64()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
        vmlinux.o: warning: objtool: do_int80_syscall_32()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
        vmlinux.o: warning: objtool: exc_general_protection()+0x22: call to stackleak_track_stack() leaves .noinstr.text section
        vmlinux.o: warning: objtool: fixup_bad_iret()+0x20: call to stackleak_track_stack() leaves .noinstr.text section
        vmlinux.o: warning: objtool: do_machine_check()+0x27: call to stackleak_track_stack() leaves .noinstr.text section
        vmlinux.o: warning: objtool: .text+0x5346e: call to stackleak_erase() leaves .noinstr.text section
        vmlinux.o: warning: objtool: .entry.text+0x143: call to stackleak_erase() leaves .noinstr.text section
        vmlinux.o: warning: objtool: .entry.text+0x10eb: call to stackleak_erase() leaves .noinstr.text section
        vmlinux.o: warning: objtool: .entry.text+0x17f9: call to stackleak_erase() leaves .noinstr.text section
      
      Note that the plugin's addition of calls to stackleak_track_stack() from
      noinstr functions is expected to be safe, as it isn't runtime
      instrumentation and is self-contained.
      
      Cc: Alexander Popov <alex.popov@linux.com>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dcb85f85
    • L
      Merge tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · eb2eb516
      Linus Torvalds committed
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bpf, netfilter, and ieee802154.
      
        Current release - regressions:
      
         - Partially revert "net/smc: Add netlink net namespace support", fix
           uABI breakage
      
         - netfilter:
            - nft_ct: fix use after free when attaching zone template
            - nft_byteorder: track register operations
      
        Previous releases - regressions:
      
         - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
      
         - phy: qca8081: fix speeds lower than 2.5Gb/s
      
         - sched: fix use-after-free in tc_new_tfilter()
      
        Previous releases - always broken:
      
         - tcp: fix mem under-charging with zerocopy sendmsg()
      
         - tcp: add missing tcp_skb_can_collapse() test in
           tcp_shift_skb_data()
      
         - neigh: do not trigger immediate probes on NUD_FAILED from
           neigh_managed_work, avoid a deadlock
      
         - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
           false-positives
      
         - netfilter: nft_reject_bridge: fix for missing reply from prerouting
      
         - smc: forward wakeup to smc socket waitqueue after fallback
      
         - ieee802154:
            - return meaningful error codes from the netlink helpers
            - mcr20a: fix lifs/sifs periods
            - at86rf230, ca8210: stop leaking skbs on error paths
      
         - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent
      
         - ax25: add refcount in ax25_dev to avoid UAF bugs
      
         - eth: mlx5e:
            - fix SFP module EEPROM query
            - fix broken SKB allocation in HW-GRO
            - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows
      
         - eth: amd-xgbe:
            - fix skb data length underflow
            - ensure reset of the tx_timer_active flag, avoid Tx timeouts
      
         - eth: stmmac: fix runtime pm use in stmmac_dvr_remove()
      
         - eth: e1000e: handshake with CSME starts from Alder Lake platforms"
      
      * tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
        ax25: fix reference count leaks of ax25_dev
        net: stmmac: ensure PTP time register reads are consistent
        net: ipa: request IPA register values be retained
        dt-bindings: net: qcom,ipa: add optional qcom,qmp property
        tools/resolve_btfids: Do not print any commands when building silently
        bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
        net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work
        tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
        net: sparx5: do not refer to skb after passing it on
        Partially revert "net/smc: Add netlink net namespace support"
        net/mlx5e: Avoid field-overflowing memcpy()
        net/mlx5e: Use struct_group() for memcpy() region
        net/mlx5e: Avoid implicit modify hdr for decap drop rule
        net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
        net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
        net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
        net/mlx5: E-Switch, Fix uninitialized variable modact
        net/mlx5e: Fix handling of wrong devices during bond netevent
        net/mlx5e: Fix broken SKB allocation in HW-GRO
        net/mlx5e: Fix wrong calculation of header index in HW_GRO
        ...
      eb2eb516
    • L
      Merge tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 551007a8
      Linus Torvalds committed
      Pull selinux fix from Paul Moore:
       "One small SELinux patch to ensure that a policy structure field is
        properly reset after freeing so that we don't inadvertently do a
        double-free on certain error conditions"
      
      * tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: fix double free of cond_list on error paths
      551007a8
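The general pattern behind this kind of fix, resetting a field after freeing it so that a second cleanup pass is harmless, can be sketched as below. The `model_policy` structure and its field names are hypothetical and chosen only for illustration; the pull request text does not say which field the actual patch resets.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical policy structure owning a heap-allocated list. */
struct model_policy {
	int *cond_list;
	size_t cond_list_len;
};

/*
 * Free the list and reset both fields. Because free(NULL) is a no-op,
 * calling this again from an error path cannot double-free.
 */
static void cond_list_destroy(struct model_policy *p)
{
	free(p->cond_list);
	p->cond_list = NULL;
	p->cond_list_len = 0;
}
```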
    • L
      Merge tag 'linux-kselftest-fixes-5.17-rc3' of... · 25b20ae8
      Linus Torvalds committed
      Merge tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull Kselftest fixes from Shuah Khan:
       "Important fixes to several tests and documentation clarification on
        running mainline kselftest on stable releases. A few notable fixes:
      
         - fix kselftest run hang due to child processes that haven't been
           terminated. The fix signals all child processes
      
         - fix false pass/fail results from vdso_test_abi, openat2, mincore
      
         - build failures when using -j (multiple jobs) option
      
         - exec test build failure due to incorrect build rule for a run-time
           created "pipe"
      
         - zram test fixes related to interaction with zram-generator to make
           sure zram test to coordinate deleted with zram-generator
      
         - zram test compression ratio calculation fix and skipping
           max_comp_streams.
      
         - increasing rtc test timeout
      
         - cpufreq test to write test results to stdout which will be necessary
           on automated test systems"
      
      * tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        kselftest: Fix vdso_test_abi return status
        selftests: skip mincore.check_file_mmap when fs lacks needed support
        selftests: openat2: Skip testcases that fail with EOPNOTSUPP
        selftests: openat2: Add missing dependency in Makefile
        selftests: openat2: Print also errno in failure messages
        selftests: futex: Use variable MAKE instead of make
        selftests/exec: Remove pipe from TEST_GEN_FILES
        selftests/zram: Adapt the situation that /dev/zram0 is being used
        selftests/zram01.sh: Fix compression ratio calculation
        selftests/zram: Skip max_comp_streams interface on newer kernel
        docs/kselftest: clarify running mainline tests on stables
        kselftest: signal all child processes
        selftests: cpufreq: Write test output to stdout as well
        selftests: rtc: Increase test timeout so that all tests run
      25b20ae8
    • D
      ax25: fix reference count leaks of ax25_dev · 87563a04
      Duoming Zhou committed
      The previous commit d01ffb9e ("ax25: add refcount in ax25_dev
      to avoid UAF bugs") introduces refcount into ax25_dev, but there
      are reference leak paths in ax25_ctl_ioctl(), ax25_fwd_ioctl(),
      ax25_rt_add(), ax25_rt_del() and ax25_rt_opt().
      
      This patch uses ax25_dev_put() and adjusts the position of
      ax25_addr_ax25dev() to fix reference count leaks of ax25_dev.
      
      Fixes: d01ffb9e ("ax25: add refcount in ax25_dev to avoid UAF bugs")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/20220203150811.42256-1-duoming@zju.edu.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      87563a04
    • Y
      net: stmmac: ensure PTP time register reads are consistent · 80d46090
      Yannick Vignon committed
      Even if protected from preemption and interrupts, a small time window
      remains when the 2 register reads could return inconsistent values,
      each time the "seconds" register changes. This could lead to an error
      of about 1 second in the reported time.
      
      Add logic to ensure the "seconds" and "nanoseconds" values are consistent.
      
      Fixes: 92ba6888 ("stmmac: add the support for PTP hw clock driver")
      Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com>
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Link: https://lore.kernel.org/r/20220203160025.750632-1-yannick.vignon@oss.nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      80d46090
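One common way to obtain a consistent pair from two registers that cannot be read atomically is to re-read the "seconds" register until it is stable around the "nanoseconds" read. The simulation below illustrates that technique under stated assumptions: `fake_clock` and its read helpers are a hypothetical register model in which every register access advances time, and the loop is a sketch of the idea rather than the driver's actual code.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u

/* Hypothetical clock model: each register read advances time a little. */
struct fake_clock {
	uint32_t sec;
	uint32_t nsec;
	uint32_t step_ns;	/* time that passes per register access */
};

static void fake_clock_advance(struct fake_clock *c)
{
	c->nsec += c->step_ns;
	if (c->nsec >= NSEC_PER_SEC) {
		c->nsec -= NSEC_PER_SEC;
		c->sec++;
	}
}

static uint32_t read_sec(struct fake_clock *c)
{
	fake_clock_advance(c);
	return c->sec;
}

static uint32_t read_nsec(struct fake_clock *c)
{
	fake_clock_advance(c);
	return c->nsec;
}

/* Read seconds, nanoseconds, seconds again; retry if seconds changed. */
static uint64_t read_time_ns(struct fake_clock *c)
{
	uint32_t sec0, sec1, ns;

	sec1 = read_sec(c);
	do {
		sec0 = sec1;
		ns = read_nsec(c);
		sec1 = read_sec(c);
	} while (sec0 != sec1);

	return (uint64_t)sec1 * NSEC_PER_SEC + ns;
}
```

If the seconds register rolls over between the first two reads, a naive two-read sequence would pair the old seconds value with a freshly wrapped nanoseconds value; the retry loop discards that pair and reads again.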
    • J
      Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 77b1b8b4
      Jakub Kicinski committed
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2022-02-03
      
      We've added 6 non-merge commits during the last 10 day(s) which contain
      a total of 7 files changed, 11 insertions(+), 236 deletions(-).
      
      The main changes are:
      
      1) Fix BPF ringbuf to allocate its area with VM_MAP instead of VM_ALLOC
         flag which otherwise trips over KASAN, from Hou Tao.
      
      2) Fix unresolved symbol warning in resolve_btfids due to LSM callback
         rename, from Alexei Starovoitov.
      
      3) Fix a possible race in inc_misses_counter() when IRQ would trigger
         during counter update, from He Fengqing.
      
      4) Fix tooling infra for cross-building with clang upon probing whether
         gcc provides the standard libraries, from Jean-Philippe Brucker.
      
      5) Fix silent mode build for resolve_btfids, from Nathan Chancellor.
      
      6) Drop unneeded and outdated lirc.h header copy from tooling infra as
         BPF does not require it anymore, from Sean Young.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        tools/resolve_btfids: Do not print any commands when building silently
        bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
        tools: Ignore errors from `which' when searching a GCC toolchain
        tools headers UAPI: remove stale lirc.h
        bpf: Fix possible race in inc_misses_counter
        bpf: Fix renaming task_getsecid_subj->current_getsecid_subj.
      ====================
      
      Link: https://lore.kernel.org/r/20220203155815.25689-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      77b1b8b4