提交 · beb27ca451a57a1c0e52b5268703f3c3173c1f8c · openeuler / Kernel

10 11月, 2021 7 次提交

net: hns3: fix ROCE base interrupt vector initialization bug · beb27ca4

由 Jie Wang 提交于 11月 10, 2021

Currently, NIC init ROCE interrupt vector with MSIX interrupt. But ROCE use
pci_irq_vector() to get interrupt vector, which adds the relative interrupt
vector again and gets wrong interrupt vector.

So fixes it by assign relative interrupt vector to ROCE instead of MSIX
interrupt vector and delete the unused struct member base_msi_vector
declaration of hclgevf_dev.

Fixes: 46a3df9f ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: NJie Wang <wangjie125@huawei.com>
Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

beb27ca4

net: hns3: fix failed to add reuse multicast mac addr to hardware when mc mac table is full · 3b4c6566

由 Guangbin Huang 提交于 11月 10, 2021

Currently, when driver is failed to add a new multicast mac address to
hardware due to the multicast mac table is full, it will directly return.
In this case, if the multicast mac list has some reuse addresses after the
new address, those reuse addresses will never be added to hardware.

To fix this problem, if function hclge_add_mc_addr_common() returns
-ENOSPC, hclge_sync_vport_mac_list() should judge whether continue or
stop to add next address.

As function hclge_sync_vport_mac_list() needs parameter mac_type to know
whether is uc or mc, refine this function to add parameter mac_type and
remove parameter sync. So does function hclge_unsync_vport_mac_list().

Fixes: ee4bcd3b ("net: hns3: refactor the MAC address configure")
Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b4c6566

net: mana: Fix spelling mistake "calledd" -> "called" · 8f1bc38b

由 Colin Ian King 提交于 11月 08, 2021

There is a spelling mistake in a dev_info message. Fix it.
Signed-off-by: NColin Ian King <colin.i.king@gmail.com>
Reviewed-by: NDexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/20211108201817.43121-1-colin.i.king@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

8f1bc38b

net/sched: sch_taprio: fix undefined behavior in ktime_mono_to_any · 6dc25401

由 Eric Dumazet 提交于 11月 08, 2021

1) if q->tk_offset == TK_OFFS_MAX, then get_tcp_tstamp() calls
   ktime_mono_to_any() with out-of-bound value.

2) if q->tk_offset is changed in taprio_parse_clockid(),
   taprio_get_time() might also call ktime_mono_to_any()
   with out-of-bound value as sysbot found:

UBSAN: array-index-out-of-bounds in kernel/time/timekeeping.c:908:27
index 3 is out of range for type 'ktime_t *[3]'
CPU: 1 PID: 25668 Comm: kworker/u4:0 Not tainted 5.15.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
 __ubsan_handle_out_of_bounds.cold+0x62/0x6c lib/ubsan.c:291
 ktime_mono_to_any+0x1d4/0x1e0 kernel/time/timekeeping.c:908
 get_tcp_tstamp net/sched/sch_taprio.c:322 [inline]
 get_packet_txtime net/sched/sch_taprio.c:353 [inline]
 taprio_enqueue_one+0x5b0/0x1460 net/sched/sch_taprio.c:420
 taprio_enqueue+0x3b1/0x730 net/sched/sch_taprio.c:485
 dev_qdisc_enqueue+0x40/0x300 net/core/dev.c:3785
 __dev_xmit_skb net/core/dev.c:3869 [inline]
 __dev_queue_xmit+0x1f6e/0x3630 net/core/dev.c:4194
 batadv_send_skb_packet+0x4a9/0x5f0 net/batman-adv/send.c:108
 batadv_iv_ogm_send_to_if net/batman-adv/bat_iv_ogm.c:393 [inline]
 batadv_iv_ogm_emit net/batman-adv/bat_iv_ogm.c:421 [inline]
 batadv_iv_send_outstanding_bat_ogm_packet+0x6d7/0x8e0 net/batman-adv/bat_iv_ogm.c:1701
 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
 kthread+0x405/0x4f0 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Fixes: 7ede7b03 ("taprio: make clock reference conversions easier")
Fixes: 54002066 ("taprio: Adjust timestamps for TCP packets")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Vedang Patel <vedang.patel@intel.com>
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Reviewed-by: NVinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/20211108180815.1822479-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

6dc25401

amt: use cancel_delayed_work() instead of flush_delayed_work() in amt_fini() · 43aa4937

由 Taehee Yoo 提交于 11月 08, 2021

When the amt module is being removed, it calls flush_delayed_work() to exit
source_gc_wq. But it wouldn't be exited properly because the
amt_source_gc_work(), which is the callback function of source_gc_wq
internally calls mod_delayed_work() again.
So, amt_source_gc_work() would be called after the amt module is removed.
Therefore kernel panic would occur.
In order to avoid it, cancel_delayed_work() should be used instead of
flush_delayed_work().

Test commands:
   modprobe amt
   modprobe -rv amt

Splat looks like:
 BUG: unable to handle page fault for address: fffffbfff80f50db
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 1237ee067 P4D 1237ee067 PUD 1237b2067 PMD 100c11067 PTE 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0+ #27
 5a0ebebc29fe5c40c68bea90197606c3a832b09f
 RIP: 0010:run_timer_softirq+0x221/0xfc0
 Code: 00 00 4c 89 e1 4c 8b 30 48 c1 e9 03 80 3c 29 00 0f 85 ed 0b 00 00
 4d 89 34 24 4d 85 f6 74 19 49 8d 7e 08 48 89 f9 48 c1 e9 03 <80> 3c 29 00
 0f 85 fa 0b 00 00 4d 89 66 08 83 04 24 01 49 89 d4 48
 RSP: 0018:ffff888119009e50 EFLAGS: 00010806
 RAX: ffff8881191f8a80 RBX: 00000000007ffe2a RCX: 1ffffffff80f50db
 RDX: ffff888119009ed0 RSI: 0000000000000008 RDI: ffffffffc07a86d8
 RBP: dffffc0000000000 R08: ffff8881191f8280 R09: ffffed102323f061
 R10: ffff8881191f8307 R11: ffffed102323f060 R12: ffff888119009ec8
 R13: 00000000000000c0 R14: ffffffffc07a86d0 R15: ffff8881191f82e8
 FS:  0000000000000000(0000) GS:ffff888119000000(0000)
 knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: fffffbfff80f50db CR3: 00000001062dc002 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <IRQ>
  ? add_timer+0x650/0x650
  ? kvm_clock_read+0x14/0x30
  ? ktime_get+0xb9/0x180
  ? rcu_read_lock_held_common+0xe/0xa0
  ? rcu_read_lock_sched_held+0x56/0xc0
  ? rcu_read_lock_bh_held+0xa0/0xa0
  ? hrtimer_interrupt+0x271/0x790
  __do_softirq+0x1d0/0x88f
  irq_exit_rcu+0xe7/0x120
  sysvec_apic_timer_interrupt+0x8a/0xb0
  </IRQ>
  <TASK>
[ ... ]

Fixes: bc54e49c ("amt: add multicast(IGMP) report message handler")
Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20211108145340.17208-1-ap420073@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

43aa4937

net: dsa: mv88e6xxx: Don't support >1G speeds on 6191X on ports other than 10 · dc2fc9f0

由 Marek Behún 提交于 11月 04, 2021

Model 88E6191X only supports >1G speeds on port 10. Port 0 and 9 are
only 1G.

Fixes: de776d0d ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Signed-off-by: NMarek Behún <kabel@kernel.org>
Cc: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20211104171747.10509-1-kabel@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

dc2fc9f0

Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · fceb0795

由 Jakub Kicinski 提交于 11月 09, 2021

Alexei Starovoitov says:

====================
pull-request: bpf 2021-11-09

We've added 7 non-merge commits during the last 3 day(s) which contain
a total of 10 files changed, 174 insertions(+), 48 deletions(-).

The main changes are:

1) Various sockmap fixes, from John and Jussi.

2) Fix out-of-bound issue with bpf_pseudo_func, from Martin.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg
  bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding
  bpf, sockmap: Fix race in ingress receive verdict with redirect to self
  bpf, sockmap: Remove unhash handler for BPF sockmap usage
  bpf, sockmap: Use stricter sk state checks in sk_lookup_assign
  bpf: selftest: Trigger a DCE on the whole subprog
  bpf: Stop caching subprog index in the bpf_pseudo_func insn
====================

Link: https://lore.kernel.org/r/20211109215702.38350-1-alexei.starovoitov@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

fceb0795

09 11月, 2021 10 次提交

amt: add IPV6 Kconfig dependency · 9758aba8

由 Arnd Bergmann 提交于 11月 08, 2021

This driver cannot be built-in if IPV6 is a loadable module:

x86_64-linux-ld: drivers/net/amt.o: in function `amt_build_mld_gq':
amt.c:(.text+0x2e7d): undefined reference to `ipv6_dev_get_saddr'

Add the idiomatic Kconfig dependency that all such modules
have.

Fixes: b9022b53 ("amt: add control plane of amt interface")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NTaehee Yoo <ap420073@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9758aba8

gve: Fix off by one in gve_tx_timeout() · 1c360cc1

由 Dan Carpenter 提交于 11月 09, 2021

The priv->ntfy_blocks[] has "priv->num_ntfy_blks" elements so this >
needs to be >= to prevent an off by one bug.  The priv->ntfy_blocks[]
array is allocated in gve_alloc_notify_blocks().

Fixes: 87a7f321 ("gve: Recover from queue stall due to missed IRQ")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1c360cc1

hamradio: defer 6pack kfree after unregister_netdev · 0b911192

由 Lin Ma 提交于 11月 08, 2021

There is a possible race condition (use-after-free) like below

 (USE)                       |  (FREE)
  dev_queue_xmit             |
   __dev_queue_xmit          |
    __dev_xmit_skb           |
     sch_direct_xmit         | ...
      xmit_one               |
       netdev_start_xmit     | tty_ldisc_kill
        __netdev_start_xmit  |  6pack_close
         sp_xmit             |   kfree
          sp_encaps          |
                             |

According to the patch "defer ax25 kfree after unregister_netdev", this
patch reorder the kfree after the unregister_netdev to avoid the possible
UAF as the unregister_netdev() is well synchronized and won't return if
there is a running routine.
Signed-off-by: NLin Ma <linma@zju.edu.cn>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0b911192

hamradio: defer ax25 kfree after unregister_netdev · 3e0588c2

由 Lin Ma 提交于 11月 08, 2021

There is a possible race condition (use-after-free) like below

 (USE)                       |  (FREE)
ax25_sendmsg                 |
 ax25_queue_xmit             |
  dev_queue_xmit             |
   __dev_queue_xmit          |
    __dev_xmit_skb           |
     sch_direct_xmit         | ...
      xmit_one               |
       netdev_start_xmit     | tty_ldisc_kill
        __netdev_start_xmit  |  mkiss_close
         ax_xmit             |   kfree
          ax_encaps          |
                             |

Even though there are two synchronization primitives before the kfree:
1. wait_for_completion(&ax->dead). This can prevent the race with
routines from mkiss_ioctl. However, it cannot stop the routine coming
from upper layer, i.e., the ax25_sendmsg.

2. netif_stop_queue(ax->dev). It seems that this line of code aims to
halt the transmit queue but it fails to stop the routine that already
being xmit.

This patch reorder the kfree after the unregister_netdev to avoid the
possible UAF as the unregister_netdev() is well synchronized and won't
return if there is a running routine.
Signed-off-by: NLin Ma <linma@zju.edu.cn>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3e0588c2

net: sungem_phy: fix code indentation · 54f0bad6

由 Jean Sacren 提交于 11月 07, 2021

Remove extra space in front of the return statement.

Fixes: eb5b5b2f ("sungem_phy: support bcm5461 phy, autoneg.")
Signed-off-by: NJean Sacren <sakiwit@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

54f0bad6

bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg · b2c46181

由 Jussi Maki 提交于 11月 03, 2021

The current conversion of skb->data_end reads like this:

  ; data_end = (void*)(long)skb->data_end;
   559: (79) r1 = *(u64 *)(r2 +200)   ; r1  = skb->data
   560: (61) r11 = *(u32 *)(r2 +112)  ; r11 = skb->len
   561: (0f) r1 += r11
   562: (61) r11 = *(u32 *)(r2 +116)
   563: (1f) r1 -= r11

But similar to the case in 84f44df6 ("bpf: sock_ops sk access may stomp
registers when dst_reg = src_reg"), the code will read an incorrect skb->len
when src == dst. In this case we end up generating this xlated code:

  ; data_end = (void*)(long)skb->data_end;
   559: (79) r1 = *(u64 *)(r1 +200)   ; r1  = skb->data
   560: (61) r11 = *(u32 *)(r1 +112)  ; r11 = (skb->data)->len
   561: (0f) r1 += r11
   562: (61) r11 = *(u32 *)(r1 +116)
   563: (1f) r1 -= r11

... where line 560 is the reading 4B of (skb->data + 112) instead of the
intended skb->len Here the skb pointer in r1 gets set to skb->data and the
later deref for skb->len ends up following skb->data instead of skb.

This fixes the issue similarly to the patch mentioned above by creating an
additional temporary variable and using to store the register when dst_reg =
src_reg. We name the variable bpf_temp_reg and place it in the cb context for
sk_skb. Then we restore from the temp to ensure nothing is lost.

Fixes: 16137b09 ("bpf: Compute data_end dynamically with JIT code")
Signed-off-by: NJussi Maki <joamaki@gmail.com>
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NJakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-6-john.fastabend@gmail.com

b2c46181

bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding · e0dc3b93

由 John Fastabend 提交于 11月 03, 2021

Strparser is reusing the qdisc_skb_cb struct to stash the skb message handling
progress, e.g. offset and length of the skb. First this is poorly named and
inherits a struct from qdisc that doesn't reflect the actual usage of cb[] at
this layer.

But, more importantly strparser is using the following to access its metadata.

  (struct _strp_msg *)((void *)skb->cb + offsetof(struct qdisc_skb_cb, data))

Where _strp_msg is defined as:

  struct _strp_msg {
        struct strp_msg            strp;                 /*     0     8 */
        int                        accum_len;            /*     8     4 */

        /* size: 12, cachelines: 1, members: 2 */
        /* last cacheline: 12 bytes */
  };

So we use 12 bytes of ->data[] in struct. However in BPF code running parser
and verdict the user has read capabilities into the data[] array as well. Its
not too problematic, but we should not be exposing internal state to BPF
program. If its really needed then we can use the probe_read() APIs which allow
reading kernel memory. And I don't believe cb[] layer poses any API breakage by
moving this around because programs can't depend on cb[] across layers.

In order to fix another issue with a ctx rewrite we need to stash a temp
variable somewhere. To make this work cleanly this patch builds a cb struct
for sk_skb types called sk_skb_cb struct. Then we can use this consistently
in the strparser, sockmap space. Additionally we can start allowing ->cb[]
write access after this.

Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Tested-by: NJussi Maki <joamaki@gmail.com>
Reviewed-by: NJakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-5-john.fastabend@gmail.com

e0dc3b93

bpf, sockmap: Fix race in ingress receive verdict with redirect to self · c5d2177a

由 John Fastabend 提交于 11月 03, 2021

A socket in a sockmap may have different combinations of programs attached
depending on configuration. There can be no programs in which case the socket
acts as a sink only. There can be a TX program in this case a BPF program is
attached to sending side, but no RX program is attached. There can be an RX
program only where sends have no BPF program attached, but receives are hooked
with BPF. And finally, both TX and RX programs may be attached. Giving us the
permutations:

 None, Tx, Rx, and TxRx

To date most of our use cases have been TX case being used as a fast datapath
to directly copy between local application and a userspace proxy. Or Rx cases
and TxRX applications that are operating an in kernel based proxy. The traffic
in the first case where we hook applications into a userspace application looks
like this:

  AppA  redirect   AppB
   Tx <-----------> Rx
   |                |
   +                +
   TCP <--> lo <--> TCP

In this case all traffic from AppA (after 3whs) is copied into the AppB
ingress queue and no traffic is ever on the TCP recieive_queue.

In the second case the application never receives, except in some rare error
cases, traffic on the actual user space socket. Instead the send happens in
the kernel.

           AppProxy       socket pool
       sk0 ------------->{sk1,sk2, skn}
        ^                      |
        |                      |
        |                      v
       ingress              lb egress
       TCP                  TCP

Here because traffic is never read off the socket with userspace recv() APIs
there is only ever one reader on the sk receive_queue. Namely the BPF programs.

However, we've started to introduce a third configuration where the BPF program
on receive should process the data, but then the normal case is to push the
data into the receive queue of AppB.

       AppB
       recv()                (userspace)
     -----------------------
       tcp_bpf_recvmsg()     (kernel)
         |             |
         |             |
         |             |
       ingress_msgQ    |
         |             |
       RX_BPF          |
         |             |
         v             v
       sk->receive_queue

This is different from the App{A,B} redirect because traffic is first received
on the sk->receive_queue.

Now for the issue. The tcp_bpf_recvmsg() handler first checks the ingress_msg
queue for any data handled by the BPF rx program and returned with PASS code
so that it was enqueued on the ingress msg queue. Then if no data exists on
that queue it checks the socket receive queue. Unfortunately, this is the same
receive_queue the BPF program is reading data off of. So we get a race. Its
possible for the recvmsg() hook to pull data off the receive_queue before the
BPF hook has a chance to read it. It typically happens when an application is
banging on recv() and getting EAGAINs. Until they manage to race with the RX
BPF program.

To fix this we note that before this patch at attach time when the socket is
loaded into the map we check if it needs a TX program or just the base set of
proto bpf hooks. Then it uses the above general RX hook regardless of if we
have a BPF program attached at rx or not. This patch now extends this check to
handle all cases enumerated above, TX, RX, TXRX, and none. And to fix above
race when an RX program is attached we use a new hook that is nearly identical
to the old one except now we do not let the recv() call skip the RX BPF program.
Now only the BPF program pulls data from sk->receive_queue and recv() only
pulls data from the ingress msgQ post BPF program handling.

With this resolved our AppB from above has been up and running for many hours
without detecting any errors. We do this by correlating counters in RX BPF
events and the AppB to ensure data is never skipping the BPF program. Selftests,
was not able to detect this because we only run them for a short period of time
on well ordered send/recvs so we don't get any of the noise we see in real
application environments.

Fixes: 51199405 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Tested-by: NJussi Maki <joamaki@gmail.com>
Reviewed-by: NJakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-4-john.fastabend@gmail.com

c5d2177a

bpf, sockmap: Remove unhash handler for BPF sockmap usage · b8b8315e

由 John Fastabend 提交于 11月 03, 2021

We do not need to handle unhash from BPF side we can simply wait for the
close to happen. The original concern was a socket could transition from
ESTABLISHED state to a new state while the BPF hook was still attached.
But, we convinced ourself this is no longer possible and we also improved
BPF sockmap to handle listen sockets so this is no longer a problem.

More importantly though there are cases where unhash is called when data is
in the receive queue. The BPF unhash logic will flush this data which is
wrong. To be correct it should keep the data in the receive queue and allow
a receiving application to continue reading the data. This may happen when
tcp_abort() is received for example. Instead of complicating the logic in
unhash simply moving all this to tcp_close() hook solves this.

Fixes: 51199405 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Tested-by: NJussi Maki <joamaki@gmail.com>
Reviewed-by: NJakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-3-john.fastabend@gmail.com

b8b8315e

bpf, sockmap: Use stricter sk state checks in sk_lookup_assign · 40a34121

由 John Fastabend 提交于 11月 03, 2021

In order to fix an issue with sockets in TCP sockmap redirect cases we plan
to allow CLOSE state sockets to exist in the sockmap. However, the check in
bpf_sk_lookup_assign() currently only invalidates sockets in the
TCP_ESTABLISHED case relying on the checks on sockmap insert to ensure we
never SOCK_CLOSE state sockets in the map.

To prepare for this change we flip the logic in bpf_sk_lookup_assign() to
explicitly test for the accepted cases. Namely, a tcp socket in TCP_LISTEN
or a udp socket in TCP_CLOSE state. This also makes the code more resilent
to future changes.
Suggested-by: NJakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NJakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-2-john.fastabend@gmail.com

40a34121

08 11月, 2021 9 次提交

litex_liteeth: Fix a double free in the remove function · c45231a7

由 Christophe JAILLET 提交于 11月 07, 2021

'netdev' is a managed resource allocated in the probe using
'devm_alloc_etherdev()'.
It must not be freed explicitly in the remove function.

Fixes: ee7da21a ("net: Add driver for LiteX's LiteETH network interface")
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c45231a7

nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails · 9fec40f8

由 Chengfeng Ye 提交于 11月 05, 2021

skb is already freed by dev_kfree_skb in pn533_fill_fragment_skbs,
but follow error handler branch when pn533_fill_fragment_skbs()
fails, skb is freed again, results in double free issue. Fix this
by not free skb in error path of pn533_fill_fragment_skbs.

Fixes: 963a82e0 ("NFC: pn533: Split large Tx frames in chunks")
Fixes: 93ad4202 ("NFC: pn533: Target mode Tx fragmentation support")
Signed-off-by: NChengfeng Ye <cyeaa@connect.ust.hk>
Reviewed-by: NDan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: NKrzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9fec40f8

selftests: net: tls: remove unused variable and code · 62b12ab5

由 Anders Roxell 提交于 11月 05, 2021

When building selftests/net with clang, the compiler warn about the
function abs() see below:

tls.c:657:15: warning: variable 'len_compared' set but not used [-Wunused-but-set-variable]
        unsigned int len_compared = 0;
                     ^

Rework to remove the unused variable and the for-loop where the variable
'len_compared' was assinged.

Fixes: 7f657d5b ("selftests: tls: add selftests for TLS sockets")
Signed-off-by: NAnders Roxell <anders.roxell@linaro.org>
Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62b12ab5

net: marvell: prestera: fix hw structure laid out · e1464db5

由 Volodymyr Mytnyk 提交于 11月 05, 2021

The prestera FW v4.0 support commit has been merged
accidentally w/o review comments addressed and waiting
for the final patch set to be uploaded. So, fix the remaining
comments related to structure laid out and build issues.
Reported-by: Nkernel test robot <lkp@intel.com>
Fixes: bb5dbf2c ("net: marvell: prestera: add firmware v4.0 support")
Signed-off-by: NVolodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e1464db5

sctp: remove unreachable code from sctp_sf_violation_chunk() · e7ea51cd

由 Alexey Khoroshilov 提交于 11月 05, 2021

sctp_sf_violation_chunk() is not called with asoc argument equal to NULL,
but if that happens it would lead to NULL pointer dereference
in sctp_vtag_verify().

The patch removes code that handles NULL asoc in sctp_sf_violation_chunk().

Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: NAlexey Khoroshilov <khoroshilov@ispras.ru>
Proposed-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e7ea51cd

llc: fix out-of-bound array index in llc_sk_dev_hash() · 8ac9dfd5

由 Eric Dumazet 提交于 11月 05, 2021

Both ifindex and LLC_SK_DEV_HASH_ENTRIES are signed.

This means that (ifindex % LLC_SK_DEV_HASH_ENTRIES) is negative
if @ifindex is negative.

We could simply make LLC_SK_DEV_HASH_ENTRIES unsigned.

In this patch I chose to use hash_32() to get more entropy
from @ifindex, like llc_sk_laddr_hashfn().

UBSAN: array-index-out-of-bounds in ./include/net/llc.h:75:26
index -43 is out of range for type 'hlist_head [64]'
CPU: 1 PID: 20999 Comm: syz-executor.3 Not tainted 5.15.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
 __ubsan_handle_out_of_bounds.cold+0x62/0x6c lib/ubsan.c:291
 llc_sk_dev_hash include/net/llc.h:75 [inline]
 llc_sap_add_socket+0x49c/0x520 net/llc/llc_conn.c:697
 llc_ui_bind+0x680/0xd70 net/llc/af_llc.c:404
 __sys_bind+0x1e9/0x250 net/socket.c:1693
 __do_sys_bind net/socket.c:1704 [inline]
 __se_sys_bind net/socket.c:1702 [inline]
 __x64_sys_bind+0x6f/0xb0 net/socket.c:1702
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa503407ae9

Fixes: 6d2e3ea2 ("llc: use a device based hash table to speed up multicast delivery")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8ac9dfd5

net: hisilicon: fix hsn3_ethtool kernel-doc warnings · 85879f13

由 Randy Dunlap 提交于 11月 05, 2021

Fix kernel-doc warnings and spacing in hns3_ethtool.c:

hns3_ethtool.c:246: warning: No description found for return value of 'hns3_lp_run_test'
hns3_ethtool.c:408: warning: expecting prototype for hns3_nic_self_test(). Prototype was for hns3_self_test() instead
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Reported-by: Nkernel test robot <lkp@intel.com>
Cc: Peng Li <lipeng321@huawei.com>
Cc: Guangbin Huang <huangguangbin2@huawei.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

85879f13

nfc: port100: lower verbosity of cancelled URB messages · 08fcdfa6

由 Krzysztof Kozlowski 提交于 11月 07, 2021

It is not an error to receive an URB with -ENOENT because it can come
from regular user operations, e.g. pressing CTRL+C when running nfctool
from neard. Make it a debugging message, not an error.
Signed-off-by: NKrzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

08fcdfa6

Merge tag 'linux-can-fixes-for-5.16-20211106' of... · f05fb508

由 David S. Miller 提交于 11月 07, 2021

Merge tag 'linux-can-fixes-for-5.16-20211106' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

linux-can-fixes-for-5.16-20211106

Marc Kleine-Budde says:

====================
pull-request: can 2021-11-06

this is a pull request of 8 patches for net/master.

The first 3 patches are by Zhang Changzhong and fix 3 standard
conformance problems in the j1939 CAN stack.

The next patch is by Vincent Mailhol and fixes a memory leak in the
leak error path of the etas_es58x CAN driver.

Stephane Grosjean contributes 2 patches for the peak_usb driver to fix
the bus error handling and update the order of printed information
regarding firmware version and available updates.

The last 2 patches are by me and fixes a packet starvation problem in
the bus off case and the error handling in the mcp251xfd_chip_start()
function.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f05fb508

07 11月, 2021 11 次提交

can: mcp251xfd: mcp251xfd_chip_start(): fix error handling for mcp251xfd_chip_rx_int_enable() · 69c55f6e

由 Marc Kleine-Budde 提交于 10月 19, 2021

This patch fixes the error handling for mcp251xfd_chip_rx_int_enable().
Instead just returning the error, properly shut down the chip.

Link: https://lore.kernel.org/all/20211106201526.44292-2-mkl@pengutronix.de
Fixes: 55e5b97f ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

69c55f6e

Merge branch 'bpf: Fix out-of-bound issue when jit-ing bpf_pseudo_func' · 47b3708c

由 Alexei Starovoitov 提交于 11月 06, 2021

Martin KaFai says:

====================

This set fixes an out-of-bound access issue when jit-ing the
bpf_pseudo_func insn (i.e. ld_imm64 with src_reg == BPF_PSEUDO_FUNC)
====================
Reported-by: NYonatan Komornik <yoniko@gmail.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

47b3708c

can: mcp251xfd: mcp251xfd_irq(): add missing... · 691204bd

由 Marc Kleine-Budde 提交于 10月 15, 2021

can: mcp251xfd: mcp251xfd_irq(): add missing can_rx_offload_threaded_irq_finish() in case of bus off

The function can_rx_offload_threaded_irq_finish() is needed to trigger
the NAPI thread to deliver read CAN frames to the networking stack.

This patch adds the missing call to can_rx_offload_threaded_irq_finish()
in case of a bus off, before leaving the interrupt handler to avoid
packet starvation.

Link: https://lore.kernel.org/all/20211106201526.44292-1-mkl@pengutronix.de
Fixes: 30bfec4f ("can: rx-offload: can_rx_offload_threaded_irq_finish(): add new function to be called from threaded interrupt")
Cc: stable@vger.kernel.org
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

691204bd

bpf: selftest: Trigger a DCE on the whole subprog · d99341b3

由 Martin KaFai Lau 提交于 11月 05, 2021

This patch adds a test to trigger the DCE to remove
the whole subprog to ensure the verifier  does not
depend on a stable subprog index.  The DCE is done
by testing a global const.
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211106014020.651638-1-kafai@fb.com

d99341b3

bpf: Stop caching subprog index in the bpf_pseudo_func insn · 3990ed4c

由 Martin KaFai Lau 提交于 11月 05, 2021

This patch is to fix an out-of-bound access issue when jit-ing the
bpf_pseudo_func insn (i.e. ld_imm64 with src_reg == BPF_PSEUDO_FUNC)

In jit_subprog(), it currently reuses the subprog index cached in
insn[1].imm.  This subprog index is an index into a few array related
to subprogs.  For example, in jit_subprog(), it is an index to the newly
allocated 'struct bpf_prog **func' array.

The subprog index was cached in insn[1].imm after add_subprog().  However,
this could become outdated (and too big in this case) if some subprogs
are completely removed during dead code elimination (in
adjust_subprog_starts_after_remove).  The cached index in insn[1].imm
is not updated accordingly and causing out-of-bound issue in the later
jit_subprog().

Unlike bpf_pseudo_'func' insn, the current bpf_pseudo_'call' insn
is handling the DCE properly by calling find_subprog(insn->imm) to
figure out the index instead of caching the subprog index.
The existing bpf_adj_branches() will adjust the insn->imm
whenever insn is added or removed.

Instead of having two ways handling subprog index,
this patch is to make bpf_pseudo_func works more like
bpf_pseudo_call.

First change is to stop caching the subprog index result
in insn[1].imm after add_subprog().  The verification
process will use find_subprog(insn->imm) to figure
out the subprog index.

Second change is in bpf_adj_branches() and have it to
adjust the insn->imm for the bpf_pseudo_func insn also
whenever insn is added or removed.

Third change is in jit_subprog().  Like the bpf_pseudo_call handling,
bpf_pseudo_func temporarily stores the find_subprog() result
in insn->off.  It is fine because the prog's insn has been finalized
at this point.  insn->off will be reset back to 0 later to avoid
confusing the userspace prog dump tool.

Fixes: 69c087ba ("bpf: Add bpf_for_each_map_elem() helper")
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211106014014.651018-1-kafai@fb.com

3990ed4c

can: peak_usb: exchange the order of information messages · 6b78ba3e

由 Stephane Grosjean 提交于 10月 21, 2021

Proposes the possible update of the PCAN-USB firmware after indicating its
name and current version.

Link: https://lore.kernel.org/all/20211021081505.18223-3-s.grosjean@peak-system.comSigned-off-by: NStephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

6b78ba3e

can: peak_usb: always ask for BERR reporting for PCAN-USB devices · 3f1c7aa2

由 Stephane Grosjean 提交于 10月 21, 2021

Since for the PCAN-USB, the management of the transition to the
ERROR_WARNING or ERROR_PASSIVE state is done according to the error
counters, these must be requested unconditionally.

Link: https://lore.kernel.org/all/20211021081505.18223-2-s.grosjean@peak-system.com
Fixes: c11dcee7 ("can: peak_usb: pcan_usb_decode_error(): upgrade handling of bus state changes")
Cc: stable@vger.kernel.org
Signed-off-by: NStephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

3f1c7aa2

can: etas_es58x: es58x_rx_err_msg(): fix memory leak in error path · d9447f76

由 Vincent Mailhol 提交于 10月 27, 2021

In es58x_rx_err_msg(), if can->do_set_mode() fails, the function
directly returns without calling netif_rx(skb). This means that the
skb previously allocated by alloc_can_err_skb() is not freed. In other
terms, this is a memory leak.

This patch simply removes the return statement in the error branch and
let the function continue.

Issue was found with GCC -fanalyzer, please follow the link below for
details.

Fixes: 85372578 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces")
Link: https://lore.kernel.org/all/20211026180740.1953265-1-mailhol.vincent@wanadoo.frSigned-off-by: NVincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

d9447f76

can: j1939: j1939_tp_cmd_recv(): check the dst address of TP.CM_BAM · 164051a6

由 Zhang Changzhong 提交于 10月 28, 2021

The TP.CM_BAM message must be sent to the global address [1], so add a
check to drop TP.CM_BAM sent to a non-global address.

Without this patch, the receiver will treat the following packets as
normal RTS/CTS transport:
18EC0102#20090002FF002301
18EB0102#0100000000000000
18EB0102#020000FFFFFFFFFF

[1] SAE-J1939-82 2015 A.3.3 Row 1.

Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/all/1635431907-15617-4-git-send-email-zhangchangzhong@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: NOleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

164051a6

can: j1939: j1939_can_recv(): ignore messages with invalid source address · a79305e1

由 Zhang Changzhong 提交于 10月 28, 2021

According to SAE-J1939-82 2015 (A.3.6 Row 2), a receiver should never
send TP.CM_CTS to the global address, so we can add a check in
j1939_can_recv() to drop messages with invalid source address.

Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/all/1635431907-15617-3-git-send-email-zhangchangzhong@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: NOleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

a79305e1

can: j1939: j1939_tp_cmd_recv(): ignore abort message in the BAM transport · c0f49d98

由 Zhang Changzhong 提交于 10月 28, 2021

This patch prevents BAM transport from being closed by receiving abort
message, as specified in SAE-J1939-82 2015 (A.3.3 Row 4).

Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/all/1635431907-15617-2-git-send-email-zhangchangzhong@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: NOleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

c0f49d98

06 11月, 2021 2 次提交

ipv6: remove useless assignment to newinet in tcp_v6_syn_recv_sock() · 70bf363d

由 Nghia Le 提交于 11月 04, 2021

The newinet value is initialized with inet_sk() in a block code to
handle sockets for the ETH_P_IP protocol. Along this code path,
newinet is never read. Thus, assignment to newinet is needless and
can be removed.
Signed-off-by: NNghia Le <nghialm78@gmail.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211104143740.32446-1-nghialm78@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

70bf363d

Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 9bea6aa4

由 Jakub Kicinski 提交于 11月 05, 2021

Daniel Borkmann says:

====================
pull-request: bpf 2021-11-05

We've added 15 non-merge commits during the last 3 day(s) which contain
a total of 14 files changed, 199 insertions(+), 90 deletions(-).

The main changes are:

1) Fix regression from stack spill/fill of <8 byte scalars, from Martin KaFai Lau.

2) Fix perf's build of bpftool's bootstrap version due to missing libbpf
   headers, from Quentin Monnet.

3) Fix riscv{32,64} BPF exception tables build errors and warnings, from Björn Töpel.

4) Fix bpf fs to allow RENAME_EXCHANGE support for atomic upgrades on sk_lookup
   control planes, from Lorenz Bauer.

5) Fix libbpf's error reporting in bpf_map_lookup_and_delete_elem_flags() due to
   missing libbpf_err_errno(), from Mehrdad Arshad Rad.

6) Various fixes to make xdp_redirect_multi selftest more reliable, from Hangbin Liu.

7) Fix netcnt selftest to make it run serial and thus avoid conflicts with other
   cgroup/skb selftests run in parallel that could cause flakes, from Andrii Nakryiko.

8) Fix reuseport_bpf_numa networking selftest to skip unavailable NUMA nodes,
   from Kleber Sacilotto de Souza.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  riscv, bpf: Fix RV32 broken build, and silence RV64 warning
  selftests/bpf/xdp_redirect_multi: Limit the tests in netns
  selftests/bpf/xdp_redirect_multi: Give tcpdump a chance to terminate cleanly
  selftests/bpf/xdp_redirect_multi: Use arping to accurate the arp number
  selftests/bpf/xdp_redirect_multi: Put the logs to tmp folder
  libbpf: Fix lookup_and_delete_elem_flags error reporting
  bpftool: Install libbpf headers for the bootstrap version, too
  selftests/net: Fix reuseport_bpf_numa by skipping unavailable nodes
  selftests/bpf: Verifier test on refill from a smaller spill
  bpf: Do not reject when the stack read size is different from the tracked scalar size
  selftests/bpf: Make netcnt selftests serial to avoid spurious failures
  selftests/bpf: Test RENAME_EXCHANGE and RENAME_NOREPLACE on bpffs
  selftests/bpf: Convert test_bpffs to ASSERT macros
  libfs: Support RENAME_EXCHANGE in simple_rename()
  libfs: Move shmem_exchange to simple_rename_exchange
====================

Link: https://lore.kernel.org/r/20211105165803.29372-1-daniel@iogearbox.netSigned-off-by: NJakub Kicinski <kuba@kernel.org>

9bea6aa4

05 11月, 2021 1 次提交

riscv, bpf: Fix RV32 broken build, and silence RV64 warning · f47d4ffe

由 Björn Töpel 提交于 11月 03, 2021

Commit 252c765b ("riscv, bpf: Add BPF exception tables") only addressed
RV64, and broke the RV32 build [1]. Fix by gating the exception tables code
with CONFIG_ARCH_RV64I.

Further, silence a "-Wmissing-prototypes" warning [2] in the RV64 BPF JIT.

  [1] https://lore.kernel.org/llvm/202111020610.9oy9Rr0G-lkp@intel.com/
  [2] https://lore.kernel.org/llvm/202110290334.2zdMyRq4-lkp@intel.com/

Fixes: 252c765b ("riscv, bpf: Add BPF exception tables")
Signed-off-by: NBjörn Töpel <bjorn@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NTong Tiangen <tongtiangen@huawei.com>
Link: https://lore.kernel.org/bpf/20211103115453.397209-1-bjorn@kernel.org

f47d4ffe

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功