提交 · 93c5aecc35c61414073d848e1ba637fc2cae98a8 · openeuler / Kernel

21 1月, 2021 18 次提交

bpf,x64: Pad NOPs to make images converge more easily · 93c5aecc

由 Gary Lin 提交于 1月 19, 2021

The x64 bpf jit expects bpf images converge within the given passes, but
it could fail to do so with some corner cases. For example:

      l0:     ja 40
      l1:     ja 40

        [... repeated ja 40 ]

      l39:    ja 40
      l40:    ret #0

This bpf program contains 40 "ja 40" instructions which are effectively
NOPs and designed to be replaced with valid code dynamically. Ideally,
bpf jit should optimize those "ja 40" instructions out when translating
the bpf instructions into x64 machine code. However, do_jit() can only
remove one "ja 40" for offset==0 on each pass, so it requires at least
40 runs to eliminate those JMPs and exceeds the current limit of
passes(20). In the end, the program got rejected when BPF_JIT_ALWAYS_ON
is set even though it's legit as a classic socket filter.

To make bpf images more likely converge within 20 passes, this commit
pads some instructions with NOPs in the last 5 passes:

1. conditional jumps
  A possible size variance comes from the adoption of imm8 JMP. If the
  offset is imm8, we calculate the size difference of this BPF instruction
  between the previous and the current pass and fill the gap with NOPs.
  To avoid the recalculation of jump offset, those NOPs are inserted before
  the JMP code, so we have to subtract the 2 bytes of imm8 JMP when
  calculating the NOP number.

2. BPF_JA
  There are two conditions for BPF_JA.
  a.) nop jumps
    If this instruction is not optimized out in the previous pass,
    instead of removing it, we insert the equivalent size of NOPs.
  b.) label jumps
    Similar to condition jumps, we prepend NOPs right before the JMP
    code.

To make the code concise, emit_nops() is modified to use the signed len and
return the number of inserted NOPs.

For bpf-to-bpf, we always enable padding for the extra pass since there
is only one extra run and the jump padding doesn't affected the images
that converge without padding.

After applying this patch, the corner case was loaded with the following
jit code:

    flen=45 proglen=77 pass=17 image=ffffffffc03367d4 from=jump pid=10097
    JIT code: 00000000: 0f 1f 44 00 00 55 48 89 e5 53 41 55 31 c0 45 31
    JIT code: 00000010: ed 48 89 fb eb 30 eb 2e eb 2c eb 2a eb 28 eb 26
    JIT code: 00000020: eb 24 eb 22 eb 20 eb 1e eb 1c eb 1a eb 18 eb 16
    JIT code: 00000030: eb 14 eb 12 eb 10 eb 0e eb 0c eb 0a eb 08 eb 06
    JIT code: 00000040: eb 04 eb 02 66 90 31 c0 41 5d 5b c9 c3

     0: 0f 1f 44 00 00          nop    DWORD PTR [rax+rax*1+0x0]
     5: 55                      push   rbp
     6: 48 89 e5                mov    rbp,rsp
     9: 53                      push   rbx
     a: 41 55                   push   r13
     c: 31 c0                   xor    eax,eax
     e: 45 31 ed                xor    r13d,r13d
    11: 48 89 fb                mov    rbx,rdi
    14: eb 30                   jmp    0x46
    16: eb 2e                   jmp    0x46
        ...
    3e: eb 06                   jmp    0x46
    40: eb 04                   jmp    0x46
    42: eb 02                   jmp    0x46
    44: 66 90                   xchg   ax,ax
    46: 31 c0                   xor    eax,eax
    48: 41 5d                   pop    r13
    4a: 5b                      pop    rbx
    4b: c9                      leave
    4c: c3                      ret

At the 16th pass, 15 jumps were already optimized out, and one jump was
replaced with NOPs at 44 and the image converged at the 17th pass.

v4:
  - Add the detailed comments about the possible padding bytes

v3:
  - Copy the instructions of prologue separately or the size calculation
    of the first BPF instruction would include the prologue.
  - Replace WARN_ONCE() with pr_err() and EFAULT
  - Use MAX_PASSES in the for loop condition check
  - Remove the "padded" flag from x64_jit_data. For the extra pass of
    subprogs, padding is always enabled since it won't hurt the images
    that converge without padding.

v2:
  - Simplify the sample code in the description and provide the jit code
  - Check the expected padding bytes with WARN_ONCE
  - Move the 'padded' flag to 'struct x64_jit_data'
Signed-off-by: NGary Lin <glin@suse.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210119102501.511-2-glin@suse.com

93c5aecc

docs, bpf: Add minimal markup to address doc warning · d2e04b9d

由 Lukas Bulwahn 提交于 1月 18, 2021

Commit 91c960b0 ("bpf: Rename BPF_XADD and prepare to encode other
atomics in .imm") modified the BPF documentation, but missed some ReST
markup.

Hence, make htmldocs warns on Documentation/networking/filter.rst:1053:

WARNING: Inline emphasis start-string without end-string.

Add some minimal markup to address this warning.

Fixes: 91c960b0 ("bpf: Rename BPF_XADD and prepare to encode other atomics in .imm")
Signed-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NBrendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/bpf/20210118080004.6367-1-lukas.bulwahn@gmail.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

d2e04b9d

samples/bpf: Add BPF_ATOMIC_OP macro for BPF samples · da9d35e2

由 Björn Töpel 提交于 1月 18, 2021

Brendan Jackman added extend atomic operations to the BPF instruction
set in commit 7064a734 ("Merge branch 'Atomics for eBPF'"), which
introduces the BPF_ATOMIC_OP macro. However, that macro was missing
for the BPF samples. Fix that by adding it into bpf_insn.h.

Fixes: 91c960b0 ("bpf: Rename BPF_XADD and prepare to encode other atomics in .imm")
Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NBrendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/bpf/20210118091753.107572-1-bjorn.topel@gmail.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

da9d35e2

net, xdp: Introduce xdp_build_skb_from_frame utility routine · 89f479f0

由 Lorenzo Bianconi 提交于 1月 12, 2021

Introduce xdp_build_skb_from_frame utility routine to build the skb
from xdp_frame. Respect to __xdp_build_skb_from_frame,
xdp_build_skb_from_frame will allocate the skb object. Rely on
xdp_build_skb_from_frame in veth driver.
Introduce missing xdp metadata support in veth_xdp_rcv_one routine.
Add missing metadata support in veth_xdp_rcv_one().
Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NToshiaki Makita <toshiaki.makita1@gmail.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/94ade9e853162ae1947941965193190da97457bc.1610475660.git.lorenzo@kernel.orgSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

89f479f0

net, xdp: Introduce __xdp_build_skb_from_frame utility routine · 97a0e1ea

由 Lorenzo Bianconi 提交于 1月 12, 2021

Introduce __xdp_build_skb_from_frame utility routine to build
the skb from xdp_frame. Rely on __xdp_build_skb_from_frame in
cpumap code.
Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/4f9f4c6b3dd3933770c617eb6689dbc0c6e25863.1610475660.git.lorenzo@kernel.orgSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

97a0e1ea

bpf, selftests: Fold test_current_pid_tgid_new_ns into test_progs. · 09c02d55

由 Carlos Neira 提交于 1月 14, 2021

Currently tests for bpf_get_ns_current_pid_tgid() are outside test_progs.
This change folds test cases into test_progs.

Changes from v11:

 - Fixed test failure is not detected.
 - Removed EXIT(3) call as it will stop test_progs execution.
Signed-off-by: NCarlos Neira <cneirabustos@gmail.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210114141033.GA17348@localhostSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

09c02d55

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 0fe2f273

由 Jakub Kicinski 提交于 1月 20, 2021

Conflicts:

drivers/net/can/dev.c
  commit 03f16c50 ("can: dev: can_restart: fix use after free bug")
  commit 3e77f70e ("can: dev: move driver related infrastructure into separate subdir")

  Code move.

drivers/net/dsa/b53/b53_common.c
 commit 8e4052c3 ("net: dsa: b53: fix an off by one in checking "vlan->vid"")
 commit b7a9e0da ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects")

 Field rename.
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

0fe2f273

Merge tag 'net-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 75439bc4

由 Linus Torvalds 提交于 1月 20, 2021

Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.11-rc5, including fixes from bpf, wireless, and
  can trees.

  Current release - regressions:

   - nfc: nci: fix the wrong NCI_CORE_INIT parameters

  Current release - new code bugs:

   - bpf: allow empty module BTFs

  Previous releases - regressions:

   - bpf: fix signed_{sub,add32}_overflows type handling

   - tcp: do not mess with cloned skbs in tcp_add_backlog()

   - bpf: prevent double bpf_prog_put call from bpf_tracing_prog_attach

   - bpf: don't leak memory in bpf getsockopt when optlen == 0

   - tcp: fix potential use-after-free due to double kfree()

   - mac80211: fix encryption issues with WEP

   - devlink: use right genl user_ptr when handling port param get/set

   - ipv6: set multicast flag on the multicast route

   - tcp: fix TCP_USER_TIMEOUT with zero window

  Previous releases - always broken:

   - bpf: local storage helpers should check nullness of owner ptr passed

   - mac80211: fix incorrect strlen of .write in debugfs

   - cls_flower: call nla_ok() before nla_next()

   - skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too"

* tag 'net-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
  net: systemport: free dev before on error path
  net: usb: cdc_ncm: don't spew notifications
  net: mscc: ocelot: Fix multicast to the CPU port
  tcp: Fix potential use-after-free due to double kfree()
  bpf: Fix signed_{sub,add32}_overflows type handling
  can: peak_usb: fix use after free bugs
  can: vxcan: vxcan_xmit: fix use after free bug
  can: dev: can_restart: fix use after free bug
  tcp: fix TCP socket rehash stats mis-accounting
  net: dsa: b53: fix an off by one in checking "vlan->vid"
  tcp: do not mess with cloned skbs in tcp_add_backlog()
  selftests: net: fib_tests: remove duplicate log test
  net: nfc: nci: fix the wrong NCI_CORE_INIT parameters
  sh_eth: Fix power down vs. is_opened flag ordering
  net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
  netfilter: rpfilter: mask ecn bits before fib lookup
  udp: mask TOS bits in udp_v4_early_demux()
  xsk: Clear pool even for inactive queues
  bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback
  sh_eth: Make PHY access aware of Runtime PM to fix reboot crash
  ...

75439bc4

Merge tag 'for-linus-5.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 2e4ceed6

由 Linus Torvalds 提交于 1月 20, 2021

Pull xen fix from Juergen Gross:
 "A fix for build failure showing up in some configurations"

* tag 'for-linus-5.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: fix 'nopvspin' build error

2e4ceed6

X.509: Fix crash caused by NULL pointer · 7178a107

由 Tianjia Zhang 提交于 1月 19, 2021

On the following call path, `sig->pkey_algo` is not assigned
in asymmetric_key_verify_signature(), which causes runtime
crash in public_key_verify_signature().

  keyctl_pkey_verify
    asymmetric_key_verify_signature
      verify_signature
        public_key_verify_signature

This patch simply check this situation and fixes the crash
caused by NULL pointer.

Fixes: 21552563 ("X.509: support OSCCA SM2-with-SM3 certificate verification")
Reported-by: NTobias Markus <tobias@markus-regensburg.de>
Signed-off-by: NTianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-and-tested-by: NToke Høiland-Jørgensen <toke@redhat.com>
Tested-by: NJoão Fonseca <jpedrofonseca@ua.pt>
Acked-by: NJarkko Sakkinen <jarkko@kernel.org>
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7178a107

cachefiles: Drop superfluous readpages aops NULL check · db58465f

由 Takashi Iwai 提交于 1月 20, 2021

After the recent actions to convert readpages aops to readahead, the
NULL checks of readpages aops in cachefiles_read_or_alloc_page() may
hit falsely.  More badly, it's an ASSERT() call, and this panics.

Drop the superfluous NULL checks for fixing this regression.

[DH: Note that cachefiles never actually used readpages, so this check was
 never actually necessary]

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208883
BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1175245
Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NMatthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

db58465f

Merge tag 'linux-can-fixes-for-5.11-20210120' of... · 535d3159

由 Jakub Kicinski 提交于 1月 20, 2021

Merge tag 'linux-can-fixes-for-5.11-20210120' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
linux-can-fixes-for-5.11-20210120

All three patches are by Vincent Mailhol and fix a potential use after free bug
in the CAN device infrastructure, the vxcan driver, and the peak_usk driver. In
the TX-path the skb is used to read from after it was passed to the networking
stack with netif_rx_ni().

* tag 'linux-can-fixes-for-5.11-20210120' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: peak_usb: fix use after free bugs
  can: vxcan: vxcan_xmit: fix use after free bug
  can: dev: can_restart: fix use after free bug
====================

Link: https://lore.kernel.org/r/20210120125202.2187358-1-mkl@pengutronix.deSigned-off-by: NJakub Kicinski <kuba@kernel.org>

535d3159

net: systemport: free dev before on error path · 0c630a66

由 Pan Bian 提交于 1月 19, 2021

On the error path, it should goto the error handling label to free
allocated memory rather than directly return.

Fixes: 31bc72d9 ("net: systemport: fetch and use clock resources")
Signed-off-by: NPan Bian <bianpan2016@163.com>
Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210120044423.1704-1-bianpan2016@163.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

0c630a66

net: usb: cdc_ncm: don't spew notifications · de658a19

由 Grant Grundler 提交于 1月 19, 2021

RTL8156 sends notifications about every 32ms.
Only display/log notifications when something changes.

This issue has been reported by others:
	https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832472
	https://lkml.org/lkml/2020/8/27/1083

...
[785962.779840] usb 1-1: new high-speed USB device number 5 using xhci_hcd
[785962.929944] usb 1-1: New USB device found, idVendor=0bda, idProduct=8156, bcdDevice=30.00
[785962.929949] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[785962.929952] usb 1-1: Product: USB 10/100/1G/2.5G LAN
[785962.929954] usb 1-1: Manufacturer: Realtek
[785962.929956] usb 1-1: SerialNumber: 000000001
[785962.991755] usbcore: registered new interface driver cdc_ether
[785963.017068] cdc_ncm 1-1:2.0: MAC-Address: 00:24:27:88:08:15
[785963.017072] cdc_ncm 1-1:2.0: setting rx_max = 16384
[785963.017169] cdc_ncm 1-1:2.0: setting tx_max = 16384
[785963.017682] cdc_ncm 1-1:2.0 usb0: register 'cdc_ncm' at usb-0000:00:14.0-1, CDC NCM, 00:24:27:88:08:15
[785963.019211] usbcore: registered new interface driver cdc_ncm
[785963.023856] usbcore: registered new interface driver cdc_wdm
[785963.025461] usbcore: registered new interface driver cdc_mbim
[785963.038824] cdc_ncm 1-1:2.0 enx002427880815: renamed from usb0
[785963.089586] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
[785963.121673] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
[785963.153682] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
...

This is about 2KB per second and will overwrite all contents of a 1MB
dmesg buffer in under 10 minutes rendering them useless for debugging
many kernel problems.

This is also an extra 180 MB/day in /var/logs (or 1GB per week) rendering
the majority of those logs useless too.

When the link is up (expected state), spew amount is >2x higher:
...
[786139.600992] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.632997] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
[786139.665097] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.697100] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
[786139.729094] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.761108] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
...

Chrome OS cannot support RTL8156 until this is fixed.
Signed-off-by: NGrant Grundler <grundler@chromium.org>
Reviewed-by: NHayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20210120011208.3768105-1-grundler@chromium.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

de658a19

net: mscc: ocelot: Fix multicast to the CPU port · 584b7cfc

由 Alban Bedel 提交于 1月 19, 2021

Multicast entries in the MAC table use the high bits of the MAC
address to encode the ports that should get the packets. But this port
mask does not work for the CPU port, to receive these packets on the
CPU port the MAC_CPU_COPY flag must be set.

Because of this IPv6 was effectively not working because neighbor
solicitations were never received. This was not apparent before commit
9403c158 (net: mscc: ocelot: support IPv4, IPv6 and plain Ethernet mdb
entries) as the IPv6 entries were broken so all incoming IPv6
multicast was then treated as unknown and flooded on all ports.

To fix this problem rework the ocelot_mact_learn() to set the
MAC_CPU_COPY flag when a multicast entry that target the CPU port is
added. For this we have to read back the ports endcoded in the pseudo
MAC address by the caller. It is not a very nice design but that avoid
changing the callers and should make backporting easier.
Signed-off-by: NAlban Bedel <alban.bedel@aerq.com>
Fixes: 9403c158 ("net: mscc: ocelot: support IPv4, IPv6 and plain Ethernet mdb entries")
Link: https://lore.kernel.org/r/20210119140638.203374-1-alban.bedel@aerq.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

584b7cfc

tcp: Fix potential use-after-free due to double kfree() · c89dffc7

由 Kuniyuki Iwashima 提交于 1月 18, 2021

Receiving ACK with a valid SYN cookie, cookie_v4_check() allocates struct
request_sock and then can allocate inet_rsk(req)->ireq_opt. After that,
tcp_v4_syn_recv_sock() allocates struct sock and copies ireq_opt to
inet_sk(sk)->inet_opt. Normally, tcp_v4_syn_recv_sock() inserts the full
socket into ehash and sets NULL to ireq_opt. Otherwise,
tcp_v4_syn_recv_sock() has to reset inet_opt by NULL and free the full
socket.

The commit 01770a16 ("tcp: fix race condition when creating child
sockets from syncookies") added a new path, in which more than one cores
create full sockets for the same SYN cookie. Currently, the core which
loses the race frees the full socket without resetting inet_opt, resulting
in that both sock_put() and reqsk_put() call kfree() for the same memory:

  sock_put
    sk_free
      __sk_free
        sk_destruct
          __sk_destruct
            sk->sk_destruct/inet_sock_destruct
              kfree(rcu_dereference_protected(inet->inet_opt, 1));

  reqsk_put
    reqsk_free
      __reqsk_free
        req->rsk_ops->destructor/tcp_v4_reqsk_destructor
          kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1));

Calling kmalloc() between the double kfree() can lead to use-after-free, so
this patch fixes it by setting NULL to inet_opt before sock_put().

As a side note, this kind of issue does not happen for IPv6. This is
because tcp_v6_syn_recv_sock() clones both ipv6_opt and pktopts which
correspond to ireq_opt in IPv4.

Fixes: 01770a16 ("tcp: fix race condition when creating child sockets from syncookies")
CC: Ricardo Dias <rdias@singlestore.com>
Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
Reviewed-by: NBenjamin Herrenschmidt <benh@amazon.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210118055920.82516-1-kuniyu@amazon.co.jpSigned-off-by: NJakub Kicinski <kuba@kernel.org>

c89dffc7

Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · b3741b43

由 Jakub Kicinski 提交于 1月 20, 2021

Daniel Borkmann says:

====================
pull-request: bpf 2021-01-20

1) Fix wrong bpf_map_peek_elem_proto helper callback, from Mircea Cirjaliu.

2) Fix signed_{sub,add32}_overflows type truncation, from Daniel Borkmann.

3) Fix AF_XDP to also clear pools for inactive queues, from Maxim Mikityanskiy.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix signed_{sub,add32}_overflows type handling
  xsk: Clear pool even for inactive queues
  bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback
====================

Link: https://lore.kernel.org/r/20210120163439.8160-1-daniel@iogearbox.netSigned-off-by: NJakub Kicinski <kuba@kernel.org>

b3741b43

bpf: Fix signed_{sub,add32}_overflows type handling · bc895e8b

由 Daniel Borkmann 提交于 1月 20, 2021

Fix incorrect signed_{sub,add32}_overflows() input types (and a related buggy
comment). It looks like this might have slipped in via copy/paste issue, also
given prior to 3f50f132 ("bpf: Verifier, do explicit ALU32 bounds tracking")
the signature of signed_sub_overflows() had s64 a and s64 b as its input args
whereas now they are truncated to s32. Thus restore proper types. Also, the case
of signed_add32_overflows() is not consistent to signed_sub32_overflows(). Both
have s32 as inputs, therefore align the former.

Fixes: 3f50f132 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: NDe4dCr0w <sa516203@mail.ustc.edu.cn>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NJohn Fastabend <john.fastabend@gmail.com>
Acked-by: NAlexei Starovoitov <ast@kernel.org>

bc895e8b

20 1月, 2021 22 次提交

can: peak_usb: fix use after free bugs · 50aca891

由 Vincent Mailhol 提交于 1月 20, 2021

After calling peak_usb_netif_rx_ni(skb), dereferencing skb is unsafe.
Especially, the can_frame cf which aliases skb memory is accessed
after the peak_usb_netif_rx_ni().

Reordering the lines solves the issue.

Fixes: 0a25e1f4 ("can: peak_usb: add support for PEAK new CANFD USB adapters")
Link: https://lore.kernel.org/r/20210120114137.200019-4-mailhol.vincent@wanadoo.frSigned-off-by: NVincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

50aca891

can: vxcan: vxcan_xmit: fix use after free bug · 75854cad

由 Vincent Mailhol 提交于 1月 20, 2021

After calling netif_rx_ni(skb), dereferencing skb is unsafe.
Especially, the canfd_frame cfd which aliases skb memory is accessed
after the netif_rx_ni().

Fixes: a8f820a3 ("can: add Virtual CAN Tunnel driver (vxcan)")
Link: https://lore.kernel.org/r/20210120114137.200019-3-mailhol.vincent@wanadoo.frSigned-off-by: NVincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

75854cad

can: dev: can_restart: fix use after free bug · 03f16c50

由 Vincent Mailhol 提交于 1月 20, 2021

After calling netif_rx_ni(skb), dereferencing skb is unsafe.
Especially, the can_frame cf which aliases skb memory is accessed
after the netif_rx_ni() in:
stats->rx_bytes += cf->len;

Reordering the lines solves the issue.

Fixes: 39549eef ("can: CAN Network device driver and Netlink interface")
Link: https://lore.kernel.org/r/20210120114137.200019-2-mailhol.vincent@wanadoo.frSigned-off-by: NVincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

03f16c50

tcp: fix TCP socket rehash stats mis-accounting · 9c30ae83

由 Yuchung Cheng 提交于 1月 19, 2021

The previous commit 32efcc06 ("tcp: export count for rehash attempts")
would mis-account rehashing SNMP and socket stats:

  a. During handshake of an active open, only counts the first
     SYN timeout

  b. After handshake of passive and active open, stop updating
     after (roughly) TCP_RETRIES1 recurring RTOs

  c. After the socket aborts, over count timeout_rehash by 1

This patch fixes this by checking the rehash result from sk_rethink_txhash.

Fixes: 32efcc06 ("tcp: export count for rehash attempts")
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NNeal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20210119192619.1848270-1-ycheng@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

9c30ae83

net: dsa: b53: fix an off by one in checking "vlan->vid" · 8e4052c3

由 Dan Carpenter 提交于 1月 19, 2021

The > comparison should be >= to prevent accessing one element beyond
the end of the dev->vlans[] array in the caller function, b53_vlan_add().
The "dev->vlans" array is allocated in the b53_switch_init() function
and it has "dev->num_vlans" elements.

Fixes: a2482d2c ("net: dsa: b53: Plug in VLAN support")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/YAbxI97Dl/pmBy5V@mwandaSigned-off-by: NJakub Kicinski <kuba@kernel.org>

8e4052c3

bonding: add a vlan+srcmac tx hashing option · 7b8fc010

由 Jarod Wilson 提交于 1月 18, 2021

This comes from an end-user request, where they're running multiple VMs on
hosts with bonded interfaces connected to some interest switch topologies,
where 802.3ad isn't an option. They're currently running a proprietary
solution that effectively achieves load-balancing of VMs and bandwidth
utilization improvements with a similar form of transmission algorithm.

Basically, each VM has it's own vlan, so it always sends its traffic out
the same interface, unless that interface fails. Traffic gets split
between the interfaces, maintaining a consistent path, with failover still
available if an interface goes down.

Unlike bond_eth_hash(), this hash function is using the full source MAC
address instead of just the last byte, as there are so few components to
the hash, and in the no-vlan case, we would be returning just the last
byte of the source MAC as the hash value. It's entirely possible to have
two NICs in a bond with the same last byte of their MAC, but not the same
MAC, so this adjustment should guarantee distinct hashes in all cases.

This has been rudimetarily tested to provide similar results to the
proprietary solution it is aiming to replace. A patch for iproute2 is also
posted, to properly support the new mode there as well.

Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Thomas Davis <tadavis@lbl.gov>
Signed-off-by: NJarod Wilson <jarod@redhat.com>
Link: https://lore.kernel.org/r/20210119010927.1191922-1-jarod@redhat.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

7b8fc010

net: fix GSO for SG-enabled devices · 00b229f7

由 Paolo Abeni 提交于 1月 19, 2021

The commit dbd50f23 ("net: move the hsize check to the else
block in skb_segment") introduced a data corruption for devices
supporting scatter-gather.

The problem boils down to signed/unsigned comparison given
unexpected results: if signed 'hsize' is negative, it will be
considered greater than a positive 'len', which is unsigned.

This commit addresses resorting to the old checks order, so that
'hsize' never has a negative value when compared with 'len'.

v1 -> v2:
 - reorder hsize checks instead of explicit cast (Alex)
Bisected-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Fixes: dbd50f23 ("net: move the hsize check to the else block in skb_segment")
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Reviewed-by: NXin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/861947c2d2d087db82af93c21920ce8147d15490.1611074818.git.pabeni@redhat.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

00b229f7

tcp: do not mess with cloned skbs in tcp_add_backlog() · b160c285

由 Eric Dumazet 提交于 1月 19, 2021

Heiner Kallweit reported that some skbs were sent with
the following invalid GSO properties :
- gso_size > 0
- gso_type == 0

This was triggerring a WARN_ON_ONCE() in rtl8169_tso_csum_v2.

Juerg Haefliger was able to reproduce a similar issue using
a lan78xx NIC and a workload mixing TCP incoming traffic
and forwarded packets.

The problem is that tcp_add_backlog() is writing
over gso_segs and gso_size even if the incoming packet will not
be coalesced to the backlog tail packet.

While skb_try_coalesce() would bail out if tail packet is cloned,
this overwriting would lead to corruptions of other packets
cooked by lan78xx, sharing a common super-packet.

The strategy used by lan78xx is to use a big skb, and split
it into all received packets using skb_clone() to avoid copies.
The drawback of this strategy is that all the small skb share a common
struct skb_shared_info.

This patch rewrites TCP gso_size/gso_segs handling to only
happen on the tail skb, since skb_try_coalesce() made sure
it was not cloned.

Fixes: 4f693b55 ("tcp: implement coalescing on backlog queue")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Bisected-by: NJuerg Haefliger <juergh@canonical.com>
Tested-by: NJuerg Haefliger <juergh@canonical.com>
Reported-by: NHeiner Kallweit <hkallweit1@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209423
Link: https://lore.kernel.org/r/20210119164900.766957-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

b160c285

octeontx2-af: Remove unneeded semicolons · fc6f89dd

由 Xu Wang 提交于 1月 19, 2021

fix semicolon.cocci warnings:
./drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c:272:2-3: Unneeded semicolon
drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c:1788:3-4: Unneeded semicolon
drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c:1809:3-4: Unneeded semicolon
drivers/net/ethernet/marvell/octeontx2/af/rvu.c:1326:2-3: Unneeded semicolon
Signed-off-by: NXu Wang <vulab@iscas.ac.cn>
Link: https://lore.kernel.org/r/20210119075059.17493-1-vulab@iscas.ac.cn
Link: https://lore.kernel.org/r/20210119075507.17699-1-vulab@iscas.ac.cn
Link: https://lore.kernel.org/r/20210119080037.17931-1-vulab@iscas.ac.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>

fc6f89dd

net: smsc911x: Make Runtime PM handling more fine-grained · 1e30b8d7

由 Geert Uytterhoeven 提交于 1月 18, 2021

Currently the smsc911x driver has mininal power management: during
driver probe, the device is powered up, and during driver remove, it is
powered down.

Improve power management by making it more fine-grained:
1. Power the device down when driver probe is finished,
2. Power the device (down) when it is opened (closed),
3. Make sure the device is powered during PHY access.
Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20210118150857.796943-1-geert+renesas@glider.beSigned-off-by: NJakub Kicinski <kuba@kernel.org>

1e30b8d7

selftests: forwarding: Fix spelling mistake "succeded" -> "succeeded" · eaaf6112

由 Colin Ian King 提交于 1月 18, 2021

There are two spelling mistakes in check_fail messages. Fix them.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210118111902.71096-1-colin.king@canonical.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

eaaf6112

net: tun: fix misspellings using codespell tool · c2e315b8

由 Menglong Dong 提交于 1月 18, 2021

Some typos are found out by codespell tool:

$ codespell -w -i 3 ./drivers/net/tun.c
aovid  ==> avoid

Fix typos found by codespell.
Signed-off-by: NMenglong Dong <dong.menglong@zte.com.cn>
Link: https://lore.kernel.org/r/20210118111539.35886-1-dong.menglong@zte.com.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>

c2e315b8

taprio: boolean values to a bool variable · 0deee7aa

由 Jiapeng Zhong 提交于 1月 18, 2021

Fix the following coccicheck warnings:

./net/sched/sch_taprio.c:393:3-16: WARNING: Assignment of 0/1 to bool
variable.

./net/sched/sch_taprio.c:375:2-15: WARNING: Assignment of 0/1 to bool
variable.

./net/sched/sch_taprio.c:244:4-19: WARNING: Assignment of 0/1 to bool
variable.
Reported-by: NAbaci Robot <abaci@linux.alibaba.com>
Signed-off-by: NJiapeng Zhong <abaci-bugfix@linux.alibaba.com>
Link: https://lore.kernel.org/r/1610958662-71166-1-git-send-email-abaci-bugfix@linux.alibaba.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

0deee7aa

Merge branch 'net-ethernet-ti-am65-cpsw-nuss-introduce-support-for-am64x-cpsw3g' · 719fc6b7

由 Jakub Kicinski 提交于 1月 19, 2021

Grygorii Strashko says:

====================
net: ethernet: ti: am65-cpsw-nuss: introduce support for am64x cpsw3g

This series introduces basic support for recently introduced TI K3 AM642x SoC [1]
which contains 3 port (2 external ports) CPSW3g module. The CPSW3g integrated
in MAIN domain and can be configured in multi port or switch modes.
In this series only multi port mode is enabled. The initial version of switchdev
support was introduced by Vignesh Raghavendra [2] and work is in progress.

The overall functionality and DT bindings are similar to other K3 CPSWxg
versions, so DT binding changes are minimal and driver is mostly re-used for
TI K3 AM642x CPSW3g.

The main difference is that TI K3 AM642x SoC is not fully DMA coherent and
instead DMA coherency is supported per DMA channel.

Patches 1-2 - DT bindings update
Patches 3-4 - Update driver to support changed DMA coherency model
Patches 5-6 - adds TI K3 AM642x SoC platform data and so enable CPSW3g

[1] https://www.ti.com/lit/pdf/spruim2
[2] https://patchwork.ozlabs.org/project/netdev/cover/20201130082046.16292-1-vigneshr@ti.com/
====================

Link: https://lore.kernel.org/r/20210115192853.5469-1-grygorii.strashko@ti.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

719fc6b7

net: ethernet: ti: am65-cpsw: add support for am64x cpsw3g · 4f7cce27

由 Vignesh Raghavendra 提交于 1月 15, 2021

The TI AM64x SoCs Gigabit Ethernet Switch subsystem (CPSW3g NUSS) has three
ports (2 ext. ports) and provides Ethernet packet communication for the
device and can be configured in multi port mode or as an Ethernet switch.

This patch adds support for the corresponding CPSW3g version.
Signed-off-by: NVignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

4f7cce27

net: ti: cpsw_ale: add driver data for AM64 CPSW3g · 1dd38410

由 Vignesh Raghavendra 提交于 1月 15, 2021

The AM642x CPSW3g is similar to j721e-cpswxg except its ALE table size is
512 entries. Add entry for the same.
Signed-off-by: NVignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

1dd38410

net: ethernet: ti: am65-cpsw-nuss: Support for transparent ASEL handling · 39fd0547

由 Peter Ujfalusi 提交于 1月 15, 2021

Use the glue layer's functions to convert the dma_addr_t to and from CPPI5
address (with the ASEL bits), which should be used within the descriptors
and data buffers.

- Per channel coherency support
The DMAs use the 'ASEL' bits to select data and configuration fetch path.
The ASEL bits are placed at the unused parts of any address field used by
the DMAs (pointers to descriptors, addresses in descriptors, ring base
addresses). The ASEL is not part of the address (the DMAs can address
48bits). Individual channels can be configured to be coherent (via ACP
port) or non coherent individually by configuring the ASEL to appropriate
value.

[1] https://lore.kernel.org/patchwork/cover/1350756/Signed-off-by: NPeter Ujfalusi <peter.ujfalusi@ti.com>
Co-developed-by: NVignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: NVignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

39fd0547

net: ethernet: ti: am65-cpsw-nuss: Use DMA device for DMA API · ed569ed9

由 Peter Ujfalusi 提交于 1月 15, 2021

For DMA API the DMA device should be used as cpsw does not accesses to
descriptors or data buffers in any ways. The DMA does.

Also, drop dma_coerce_mask_and_coherent() setting on CPSW device, as it
should be done by DMA driver which does data movement.

This is required for adding AM64x CPSW3g support where DMA coherency
supported per DMA channel.
Signed-off-by: NPeter Ujfalusi <peter.ujfalusi@ti.com>
Co-developed-by: NVignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: NVignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

ed569ed9

dt-binding: net: ti: k3-am654-cpsw-nuss: update bindings for am64x cpsw3g · 19d9a846

由 Grygorii Strashko 提交于 1月 15, 2021

Update DT binding for recently introduced TI K3 AM642x SoC [1] which
contains 3 port (2 external ports) CPSW3g module. The CPSW3g integrated
in MAIN domain and can be configured in multi port or switch modes.

The overall functionality and DT bindings are similar to other K3 CPSWxg
versions, so DT binding changes are minimal:
 - reword description
 - add new compatible 'ti,am642-cpsw-nuss'
 - allow 2 external ports child nodes
 - add missed 'assigned-clock' props

[1] https://www.ti.com/lit/pdf/spruim2Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

19d9a846

dt-binding: ti: am65x-cpts: add assigned-clock and power-domains props · b3228c74

由 Grygorii Strashko 提交于 1月 15, 2021

The CPTS clock is usually a clk-mux which allows to select CPTS reference
clock by using 'assigned-clock-parents', 'assigned-clocks' DT properties.
Also depending on integration the power-domains has to be specified to
enable CPTS IP.

Hence add 'assigned-clock-parents', 'assigned-clocks' and 'power-domains'
properties to the CPTS DT bindings to avoid dtbs_check warnings:
cpts@310d0000: 'assigned-clock-parents', 'assigned-clocks' do not match any of the regexes: 'pinctrl-[0-9]+'
cpts@310d0000: 'power-domains' does not match any of the regexes: 'pinctrl-[0-9]+'
Signed-off-by: NGrygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

b3228c74

selftests: net: fib_tests: remove duplicate log test · fd23d2dc

由 Hangbin Liu 提交于 1月 19, 2021

The previous test added an address with a specified metric and check if
correspond route was created. I somehow added two logs for the same
test. Remove the duplicated one.
Reported-by: NAntoine Tenart <atenart@redhat.com>
Fixes: 0d29169a ("selftests/net/fib_tests: update addr_metric_test for peer route testing")
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
Reviewed-by: NDavid Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210119025930.2810532-1-liuhangbin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

fd23d2dc

net: nfc: nci: fix the wrong NCI_CORE_INIT parameters · 4964e5a1

由 Bongsu Jeon 提交于 1月 19, 2021

Fix the code because NCI_CORE_INIT_CMD includes two parameters in NCI2.0
but there is no parameters in NCI1.x.

Fixes: bcd684aa ("net/nfc/nci: Support NCI 2.x initial sequence")
Signed-off-by: NBongsu Jeon <bongsu.jeon@samsung.com>
Link: https://lore.kernel.org/r/20210118205522.317087-1-bongsu.jeon@samsung.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

4964e5a1

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功