1. 01 Jun, 2022 9 commits
    • bonding: guard ns_targets by CONFIG_IPV6 · c4caa500
      Committed by Hangbin Liu
      Guard ns_targets in struct bond_params by CONFIG_IPV6, which could save
      256 bytes if IPv6 is not configured. Also add this protection for the
      functions bond_is_ip6_target_ok() and bond_get_targets_ip6().
      
      Remove the IS_ENABLED() check for bond_opts[] as this will make
      BOND_OPT_NS_TARGETS uninitialized if CONFIG_IPV6 not enabled. Add
      a dummy bond_option_ns_ip6_targets_set() for this situation.
      
      Fixes: 4e24be01 ("bonding: add new parameter ns_targets")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Jonathan Toppins <jtoppins@redhat.com>
      Link: https://lore.kernel.org/r/20220531063727.224043-1-liuhangbin@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tcp: tcp_rtx_synack() can be called from process context · 0a375c82
      Committed by Eric Dumazet
      Laurent reported the enclosed report [1]
      
      This bug triggers with the following conditions:
      
      0) Kernel built with CONFIG_DEBUG_PREEMPT=y
      
      1) A new passive FastOpen TCP socket is created.
         This FO socket waits for an ACK coming from client to be a complete
         ESTABLISHED one.
      2) A socket operation on this socket goes through lock_sock()
         release_sock() dance.
      3) While the socket is owned by the user in step 2),
         a retransmit of the SYN is received and stored in socket backlog.
      4) At release_sock() time, the socket backlog is processed while
         in process context.
      5) A SYNACK packet is cooked in response of the SYN retransmit.
      6) -> tcp_rtx_synack() is called in process context.
      
      Before blamed commit, tcp_rtx_synack() was always called from BH handler,
      from a timer handler.
      
      Fix this by using TCP_INC_STATS() & NET_INC_STATS()
      which do not assume caller is in non preemptible context.
      
      [1]
      BUG: using __this_cpu_add() in preemptible [00000000] code: epollpep/2180
      caller is tcp_rtx_synack.part.0+0x36/0xc0
      CPU: 10 PID: 2180 Comm: epollpep Tainted: G           OE     5.16.0-0.bpo.4-amd64 #1  Debian 5.16.12-1~bpo11+1
      Hardware name: Supermicro SYS-5039MC-H8TRF/X11SCD-F, BIOS 1.7 11/23/2021
      Call Trace:
       <TASK>
       dump_stack_lvl+0x48/0x5e
       check_preemption_disabled+0xde/0xe0
       tcp_rtx_synack.part.0+0x36/0xc0
       tcp_rtx_synack+0x8d/0xa0
       ? kmem_cache_alloc+0x2e0/0x3e0
       ? apparmor_file_alloc_security+0x3b/0x1f0
       inet_rtx_syn_ack+0x16/0x30
       tcp_check_req+0x367/0x610
       tcp_rcv_state_process+0x91/0xf60
       ? get_nohz_timer_target+0x18/0x1a0
       ? lock_timer_base+0x61/0x80
       ? preempt_count_add+0x68/0xa0
       tcp_v4_do_rcv+0xbd/0x270
       __release_sock+0x6d/0xb0
       release_sock+0x2b/0x90
       sock_setsockopt+0x138/0x1140
       ? __sys_getsockname+0x7e/0xc0
       ? aa_sk_perm+0x3e/0x1a0
       __sys_setsockopt+0x198/0x1e0
       __x64_sys_setsockopt+0x21/0x30
       do_syscall_64+0x38/0xc0
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Laurent Fasnacht <laurent.fasnacht@proton.ch>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Link: https://lore.kernel.org/r/20220530213713.601888-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · b3c0a9ef
      Committed by Jakub Kicinski
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      1) Missing proper sanitization for nft_set_desc_concat_parse().
      
      2) Missing mutex in nf_tables pre_exit path.
      
      3) Possible double hook unregistration from clean_net path.
      
      4) Missing FLOWI_FLAG_ANYSRC flag in flowtable route lookup.
         Fix incorrect source and destination address in case of NAT.
         Patch from wenxu.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: flowtable: fix nft_flow_route source address for nat case
        netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag
        netfilter: nf_tables: double hook unregistration in netns path
        netfilter: nf_tables: hold mutex on netns pre_exit path
        netfilter: nf_tables: sanitize nft_set_desc_concat_parse()
      ====================
      
      Link: https://lore.kernel.org/r/20220531215839.84765-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: sched: add barrier to fix packet stuck problem for lockless qdisc · 2e8728c9
      Committed by Guoju Fang
      In qdisc_run_end(), the spin_unlock() only has store-release semantic,
      which guarantees all earlier memory access are visible before it. But
      the subsequent test_bit() has no barrier semantics so may be reordered
      ahead of the spin_unlock(). The store-load reordering may cause a packet
      stuck problem.
      
      The concurrent operations can be described as below,
               CPU 0                      |          CPU 1
         qdisc_run_end()                  |     qdisc_run_begin()
                .                         |           .
       ----> /* may be reordered here */  |           .
      |         .                         |           .
      |     spin_unlock()                 |         set_bit()
      |         .                         |         smp_mb__after_atomic()
       ---- test_bit()                    |         spin_trylock()
                .                         |          .
      
      Consider the following sequence of events:
          CPU 0 reorders test_bit() ahead and sees MISSED = 0
          CPU 1 calls set_bit()
          CPU 1 calls spin_trylock(), which fails
          CPU 0 executes spin_unlock()
      
      At the end of the sequence, CPU 0 calls spin_unlock() and does nothing
      further because it saw MISSED = 0. The skb on CPU 1 has been enqueued but
      no one takes it until the next CPU pushing to the qdisc (if ever ...)
      notices and dequeues it.
      
      This patch fixes this by adding one explicit barrier. As the spin_unlock()
      and test_bit() ordering is a store-load ordering, a full memory barrier
      smp_mb() is needed here.
      
      Fixes: a90c57f2 ("net: sched: fix packet stuck problem for lockless qdisc")
      Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20220528101628.120193-1-gjfang@linux.alibaba.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netfilter: flowtable: fix nft_flow_route source address for nat case · 97629b23
      Committed by wenxu
      For snat and dnat cases, the saddr should be taken from reverse tuple.
      
      Fixes: 3412e164 (netfilter: flowtable: nft_flow_route use more data for reverse route)
      Signed-off-by: wenxu <wenxu@chinatelecom.cn>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag · f1896d45
      Committed by wenxu
      The nf_flow_table gets route through ip_route_output_key. If the saddr
      is not local one, then FLOWI_FLAG_ANYSRC flag should be set. Without
      this flag, the route lookup for other_dst will fail.
      
      Fixes: 3412e164 (netfilter: flowtable: nft_flow_route use more data for reverse route)
      Signed-off-by: wenxu <wenxu@chinatelecom.cn>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: double hook unregistration in netns path · f9a43007
      Committed by Pablo Neira Ayuso
      __nft_release_hooks() is called from pre_netns exit path which
      unregisters the hooks, then the NETDEV_UNREGISTER event is triggered
      which unregisters the hooks again.
      
      [  565.221461] WARNING: CPU: 18 PID: 193 at net/netfilter/core.c:495 __nf_unregister_net_hook+0x247/0x270
      [...]
      [  565.246890] CPU: 18 PID: 193 Comm: kworker/u64:1 Tainted: G            E     5.18.0-rc7+ #27
      [  565.253682] Workqueue: netns cleanup_net
      [  565.257059] RIP: 0010:__nf_unregister_net_hook+0x247/0x270
      [...]
      [  565.297120] Call Trace:
      [  565.300900]  <TASK>
      [  565.304683]  nf_tables_flowtable_event+0x16a/0x220 [nf_tables]
      [  565.308518]  raw_notifier_call_chain+0x63/0x80
      [  565.312386]  unregister_netdevice_many+0x54f/0xb50
      
      Unregister and destroy the netdev hook from netns pre_exit via kfree_rcu
      so the NETDEV_UNREGISTER path sees unregistered hooks.
      
      Fixes: 767d1216 ("netfilter: nftables: fix possible UAF over chains from packet path in netns")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: hold mutex on netns pre_exit path · 3923b1e4
      Committed by Pablo Neira Ayuso
      clean_net() runs in workqueue while walking over the lists, grab mutex.
      
      Fixes: 767d1216 ("netfilter: nftables: fix possible UAF over chains from packet path in netns")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: sanitize nft_set_desc_concat_parse() · fecf31ee
      Committed by Pablo Neira Ayuso
      Add several sanity checks for nft_set_desc_concat_parse():
      
      - validate desc->field_count not larger than desc->field_len array.
      - field length cannot be larger than desc->field_len (ie. U8_MAX)
      - total length of the concatenation cannot be larger than register array.
      
      Joint work with Florian Westphal.
      
      Fixes: f3a2181e ("netfilter: nf_tables: Support for sets with multiple ranged fields")
      Reported-by: <zhangziming.zzm@antgroup.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  2. 31 May, 2022 5 commits
  3. 29 May, 2022 4 commits
    • Merge branch 'sfc-fixes' · 90343f57
      Committed by David S. Miller
      Íñigo Huguet says:
      
      ====================
      sfc: fix some efx_separate_tx_channels errors
      
      Trying to load the sfc driver with modparam efx_separate_tx_channels=1
      resulted in errors during initialization and not being able to use the
      NIC. These patches fix a few bugs and make it work again.
      
      v2:
      * added Martin's patch instead of a previous one of mine. Mine solved some
      of the initialization errors, but Martin's solves them in all
      possible cases.
      * removed whitespaces cleanup, as requested by Jakub
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix wrong tx channel offset with efx_separate_tx_channels · c308dfd1
      Committed by Íñigo Huguet
      tx_channel_offset is calculated in efx_allocate_msix_channels, but it is
      also calculated again in efx_set_channels because it was originally done
      there, and when efx_allocate_msix_channels was introduced it was
      forgotten to be removed from efx_set_channels.
      
      Moreover, the old calculation is wrong when using
      efx_separate_tx_channels because now we can have XDP channels after the
      TX channels, so n_channels - n_tx_channels doesn't point to the first TX
      channel.
      
      Remove the old calculation from efx_set_channels, and add the
      initialization of this variable if MSI or legacy interrupts are used,
      next to the initialization of the rest of the related variables, where
      it was missing.
      
      Fixes: 3990a8ff ("sfc: allocate channels for XDP tx queues")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix considering that all channels have TX queues · 2e102b53
      Committed by Martin Habets
      Normally, all channels have RX and TX queues, but this is not true if
      modparam efx_separate_tx_channels=1 is used. In that case, some
      channels only have RX queues and others only TX queues (or more
      precisely, they have them allocated, but not initialized).
      
      Fix efx_channel_has_tx_queues to return the correct value for this case
      too.
      
      Messages shown at probe time before the fix:
       sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
       ------------[ cut here ]------------
       netdevice: ens6f0np0: failed to initialise TXQ -1
       WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
       [...] stripped
       RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
       [...] stripped
       Call Trace:
        efx_init_tx_queue+0xaa/0xf0 [sfc]
        efx_start_channels+0x49/0x120 [sfc]
        efx_start_all+0x1f8/0x430 [sfc]
        efx_net_open+0x5a/0xe0 [sfc]
        __dev_open+0xd0/0x190
        __dev_change_flags+0x1b3/0x220
        dev_change_flags+0x21/0x60
       [...] stripped
      
      Messages shown at remove time before the fix:
       sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
       sfc 0000:03:00.0 ens6f0np0: failed to flush queues
      
      Fixes: 8700aff0 ("sfc: fix channel allocation with brute force")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
      Tested-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: enetc: Use pci_release_region() to release some resources · 18eeb4de
      Committed by Christophe JAILLET
      Some resources are allocated using pci_request_region().
      It is more straightforward to release them with pci_release_region().
      
      Fixes: 231ece36 ("enetc: Add mdio bus driver for the PCIe MDIO endpoint")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 28 May, 2022 13 commits
    • bonding: NS target should accept link local address · 5e1eeef6
      Committed by Hangbin Liu
      When setting a bond NS target, we use bond_is_ip6_target_ok() to check
      whether the address is valid. The link local address was wrongly rejected
      in bond_changelink(). This is a problem because most of the time the user
      just sets the ARP/NS target to the gateway, and the IPv6 gateway is always
      a link local address when the user sets up the interface via SLAAC.
      
      So remove the link local addr check when setting bond NS target.
      
      Fixes: 129e3c1b ("bonding: add new option ns_ip6_target")
      Reported-by: Li Liang <liali@redhat.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: nfc: Directly use ida_alloc()/free() · 91179917
      Committed by keliu
      Use ida_alloc()/ida_free() instead of deprecated
      ida_simple_get()/ida_simple_remove().
      Signed-off-by: keliu <liuke94@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: only report pause frame configuration for physical device · 0649e4d6
      Committed by Yu Xiao
      Only report pause frame configuration for physical device. Logical
      port of both PCI PF and PCI VF do not support it.
      
      Fixes: 9fdc5d85 ("nfp: update ethtool reporting of pauseframe control")
      Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dpaa: Convert to SPDX identifiers · d8064c10
      Committed by Sean Anderson
      This converts these files to use SPDX identifiers instead of license
      text.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Reviewed-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd · 11825765
      Committed by Eric Dumazet
      syzbot got a new report [1] finally pointing to a very old bug,
      added in initial support for MTU probing.
      
      tcp_mtu_probe() has checks about starting an MTU probe if
      tcp_snd_cwnd(tp) >= 11.
      
      But nothing prevents tcp_snd_cwnd(tp) from being reduced later,
      before the MTU probe succeeds.
      
      This bug would lead to potential zero-divides.
      
      Debugging added in commit 40570375 ("tcp: add accessors
      to read/set tp->snd_cwnd") has paid off :)
      
      While we are at it, address potential overflows in this code.
      
      [1]
      WARNING: CPU: 1 PID: 14132 at include/net/tcp.h:1219 tcp_mtup_probe_success+0x366/0x570 net/ipv4/tcp_input.c:2712
      Modules linked in:
      CPU: 1 PID: 14132 Comm: syz-executor.2 Not tainted 5.18.0-syzkaller-07857-gbabf0bb9 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:tcp_snd_cwnd_set include/net/tcp.h:1219 [inline]
      RIP: 0010:tcp_mtup_probe_success+0x366/0x570 net/ipv4/tcp_input.c:2712
      Code: 74 08 48 89 ef e8 da 80 17 f9 48 8b 45 00 65 48 ff 80 80 03 00 00 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 aa b0 c5 f8 <0f> 0b e9 16 fe ff ff 48 8b 4c 24 08 80 e1 07 38 c1 0f 8c c7 fc ff
      RSP: 0018:ffffc900079e70f8 EFLAGS: 00010287
      RAX: ffffffff88c0f7f6 RBX: ffff8880756e7a80 RCX: 0000000000040000
      RDX: ffffc9000c6c4000 RSI: 0000000000031f9e RDI: 0000000000031f9f
      RBP: 0000000000000000 R08: ffffffff88c0f606 R09: ffffc900079e7520
      R10: ffffed101011226d R11: 1ffff1101011226c R12: 1ffff1100eadcf50
      R13: ffff8880756e72c0 R14: 1ffff1100eadcf89 R15: dffffc0000000000
      FS:  00007f643236e700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f1ab3f1e2a0 CR3: 0000000064fe7000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       tcp_clean_rtx_queue+0x223a/0x2da0 net/ipv4/tcp_input.c:3356
       tcp_ack+0x1962/0x3c90 net/ipv4/tcp_input.c:3861
       tcp_rcv_established+0x7c8/0x1ac0 net/ipv4/tcp_input.c:5973
       tcp_v6_do_rcv+0x57b/0x1210 net/ipv6/tcp_ipv6.c:1476
       sk_backlog_rcv include/net/sock.h:1061 [inline]
       __release_sock+0x1d8/0x4c0 net/core/sock.c:2849
       release_sock+0x5d/0x1c0 net/core/sock.c:3404
       sk_stream_wait_memory+0x700/0xdc0 net/core/stream.c:145
       tcp_sendmsg_locked+0x111d/0x3fc0 net/ipv4/tcp.c:1410
       tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1448
       sock_sendmsg_nosec net/socket.c:714 [inline]
       sock_sendmsg net/socket.c:734 [inline]
       __sys_sendto+0x439/0x5c0 net/socket.c:2119
       __do_sys_sendto net/socket.c:2131 [inline]
       __se_sys_sendto net/socket.c:2127 [inline]
       __x64_sys_sendto+0xda/0xf0 net/socket.c:2127
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      RIP: 0033:0x7f6431289109
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f643236e168 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007f643139c100 RCX: 00007f6431289109
      RDX: 00000000d0d0c2ac RSI: 0000000020000080 RDI: 000000000000000a
      RBP: 00007f64312e308d R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007fff372533af R14: 00007f643236e300 R15: 0000000000022000
      
      Fixes: 5d424d5a ("[TCP]: MTU probing")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Directly use ida_alloc()/free() · 2f1de254
      Committed by Ke Liu
      Use ida_alloc()/ida_free() instead of deprecated
      ida_simple_get()/ida_simple_remove().
      Signed-off-by: Ke Liu <liuke94@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: fixes for converting from "struct smc_cdc_tx_pend **" to "struct smc_wr_tx_pend_priv *" · e225c9a5
      Committed by Guangguan Wang
      "struct smc_cdc_tx_pend **" can not directly convert
      to "struct smc_wr_tx_pend_priv *".
      
      Fixes: 2bced6ae ("net/smc: put slot when connection is killed")
      Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-ipa-fix-page-free-in-two-spots' · 9bae058a
      Committed by Jakub Kicinski
      Alex Elder says:
      
      ====================
      net: ipa: fix page free in two spots
      
      When a receive buffer is not wrapped in an SKB and passed to the
      network stack, the (compound) page gets freed within the IPA driver.
      This is currently quite rare.
      
      The pages are freed using __free_pages(), but they should instead be
      freed using put_page().  This series fixes this, in two spots.
      
      These patches work for the current linus/master branch, but won't
      apply cleanly to earlier stable branches.  (Nevertheless, the fix is
      a trivial substitution everywhere __free_pages() is called.)
      ====================
      
      Link: https://lore.kernel.org/r/20220526152314.1405629-1-elder@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: fix page free in ipa_endpoint_replenish_one() · 70132763
      Committed by Alex Elder
      Currently the (possibly compound) pages used for receive buffers are
      freed using __free_pages().  But according to this comment above the
      definition of that function, that's wrong:
          If you want to use the page's reference count to decide
          when to free the allocation, you should allocate a compound
          page, and use put_page() instead of __free_pages().
      
      Convert the call to __free_pages() in ipa_endpoint_replenish_one()
      to use put_page() instead.
      
      Fixes: 6a606b90 ("net: ipa: allocate transaction in replenish loop")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: fix page free in ipa_endpoint_trans_release() · 155c0c90
      Committed by Alex Elder
      Currently the (possibly compound) page used for receive buffers are
      freed using __free_pages().  But according to this comment above the
      definition of that function, that's wrong:
          If you want to use the page's reference count to decide when
          to free the allocation, you should allocate a compound page,
          and use put_page() instead of __free_pages().
      
      Convert the call to __free_pages() in ipa_endpoint_trans_release()
      to use put_page() instead.
      
      Fixes: ed23f026 ("net: ipa: define per-endpoint receive buffer size")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 6b51935a
      Committed by Jakub Kicinski
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2022-05-28
      
      We've added 2 non-merge commits during the last 1 day(s) which contain
      a total of 2 files changed, 6 insertions(+), 10 deletions(-).
      
      The main changes are:
      
      1) Fix ldx_probe_mem instruction in interpreter by properly zero-extending
         the bpf_probe_read_kernel() read content, from Menglong Dong.
      
      2) Fix stacktrace_build_id BPF selftest given urandom_read has been renamed
         into urandom_read_iter in random driver, from Song Liu.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf: Fix probe read error in ___bpf_prog_run()
        selftests/bpf: fix stacktrace_build_id with missing kprobe/urandom_read
      ====================
      
      Link: https://lore.kernel.org/r/20220527235042.8526-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bpf: Fix probe read error in ___bpf_prog_run() · caff1fa4
      Committed by Menglong Dong
      I think there is something wrong with BPF_PROBE_MEM in ___bpf_prog_run()
      in big-endian machine. Let's make a test and see what will happen if we
      want to load a 'u16' with BPF_PROBE_MEM.
      
      Let's make the src value '0x0001', the value of dest register will become
      0x0001000000000000, as the value will be loaded to the first 2 byte of
      DST with following code:
      
        bpf_probe_read_kernel(&DST, SIZE, (const void *)(long) (SRC + insn->off));
      
      Obviously, the value in DST is not correct. In fact, we can compare
      BPF_PROBE_MEM with LDX_MEM_H:
      
        DST = *(SIZE *)(unsigned long) (SRC + insn->off);
      
      If the memory load is done by LDX_MEM_H, the value in DST will be 0x1 now.
      
      And I think this error results in the test case 'test_bpf_sk_storage_map'
      failing:
      
        test_bpf_sk_storage_map:PASS:bpf_iter_bpf_sk_storage_map__open_and_load 0 nsec
        test_bpf_sk_storage_map:PASS:socket 0 nsec
        test_bpf_sk_storage_map:PASS:map_update 0 nsec
        test_bpf_sk_storage_map:PASS:socket 0 nsec
        test_bpf_sk_storage_map:PASS:map_update 0 nsec
        test_bpf_sk_storage_map:PASS:socket 0 nsec
        test_bpf_sk_storage_map:PASS:map_update 0 nsec
        test_bpf_sk_storage_map:PASS:attach_iter 0 nsec
        test_bpf_sk_storage_map:PASS:create_iter 0 nsec
        test_bpf_sk_storage_map:PASS:read 0 nsec
        test_bpf_sk_storage_map:FAIL:ipv6_sk_count got 0 expected 3
        $10/26 bpf_iter/bpf_sk_storage_map:FAIL
      
      The code of the test case is simple: it loads sk->sk_family into a
      register with BPF_PROBE_MEM and checks if it is AF_INET6. With this patch,
      the test case 'bpf_iter' now passes:
      
        $10  bpf_iter:OK
      
      Fixes: 2a02759e ("bpf: Add support for BTF pointers to interpreter")
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Cc: Ilya Leoshkevich <iii@linux.ibm.com>
      Link: https://lore.kernel.org/bpf/20220524021228.533216-1-imagedong@tencent.com
  5. 27 May, 2022 9 commits