Commits · abb47dc95dc6e551ca79f51d296e77878fafa4d8 · openeuler / Kernel

18 Jul, 2022 · 10 commits

tls: rx: don't keep decrypted skbs on ctx->recv_pkt · abb47dc9

Authored by Jakub Kicinski on Jul 14, 2022

Detach the skb from ctx->recv_pkt after decryption is done,
even if we can't consume it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tls: rx: don't try to keep the skbs always on the list · 008141de

Authored by Jakub Kicinski on Jul 14, 2022

I thought that having the skb either always on the ctx->rx_list
or ctx->recv_pkt will simplify the handling, as we would not
have to remember to flip it from one to the other on exit paths.

This became a little harder to justify with the fix for BPF
sockmaps. Subsequent changes will make the situation even worse.
Queue the skbs only when really needed.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tls: rx: allow only one reader at a time · 4cbc325e

Authored by Jakub Kicinski on Jul 14, 2022

recvmsg() in TLS gets data from the skb list (rx_list) or fresh
skbs we read from TCP via strparser. The former holds skbs which were
already decrypted for peek or decrypted and partially consumed.

tls_wait_data() only notices appearance of fresh skbs coming out
of TCP (or psock). It is possible, if there is a concurrent call
to peek() and recv() that the peek() will move the data from input
to rx_list without recv() noticing. recv() will then read data out
of order or never wake up.

This is not a practical use case/concern, but it makes the self
tests less reliable. This patch solves the problem by allowing
only one reader in.

Because having multiple processes calling read()/peek() is not
normal, avoid adding a lock and try to fast-path the single reader
case.
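As a rough illustration of the single-reader idea described above, the following Python sketch models a gate that admits one reader at a time and makes later readers wait. It is a user-space analogy, not the kernel code: the class `SingleReaderGate`, its methods and the helper `run_readers` are hypothetical names, and a condition variable stands in for whatever wait mechanism the patch actually uses.

```python
import threading


class SingleReaderGate:
    """Admit one reader at a time; later readers block until it leaves."""

    def __init__(self):
        self._cond = threading.Condition()
        self._reader_present = False

    def enter(self):
        with self._cond:
            # In the uncontended case the loop body never runs.
            while self._reader_present:
                self._cond.wait()
            self._reader_present = True

    def leave(self):
        with self._cond:
            self._reader_present = False
            self._cond.notify()


def run_readers(num_readers=4, rounds=200):
    """Return the highest number of readers ever inside the gate at once."""
    gate = SingleReaderGate()
    stats_lock = threading.Lock()
    inside = 0
    max_inside = 0

    def reader():
        nonlocal inside, max_inside
        for _ in range(rounds):
            gate.enter()
            with stats_lock:
                inside += 1
                max_inside = max(max_inside, inside)
            with stats_lock:
                inside -= 1
            gate.leave()

    threads = [threading.Thread(target=reader) for _ in range(num_readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_inside
```

With the gate in place, `run_readers()` should never observe more than one reader inside at a time, which is the property that keeps a concurrent peek() from moving data behind the back of a recv().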

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-smc-virt-contig-buffers' · 3898f52c

Authored by David S. Miller on Jul 18, 2022

Wen Gu says:

====================
net/smc: Introduce virtually contiguous buffers for SMC-R

On long-running enterprise production servers, high-order contiguous
memory pages are usually very rare and in most cases we can only get
fragmented pages.

When replacing TCP with SMC-R in such production scenarios, attempting
to allocate high-order physically contiguous sndbufs and RMBs may result
in frequent memory compaction, which will cause unexpected hung issue
and further stability risks.

So this patch set is aimed to allow SMC-R link group to use virtually
contiguous sndbufs and RMBs to avoid potential issues mentioned above.
Whether to use physically or virtually contiguous buffers can be set
by sysctl smcr_buf_type.

Note that using virtually contiguous buffers will bring an acceptable
performance regression, which can be mainly divided into two parts:

1) regression in data path, which is brought by additional address
   translation of sndbuf by RNIC in Tx. But in general, translating
   address through MTT is fast. According to qperf test, this part
   regression is basically less than 10% in latency and bandwidth.
   (see patch 5/6 for details)

2) regression in buffer initialization and destruction path, which is
   brought by additional MR operations of sndbufs. But thanks to link
   group buffer reuse mechanism, the impact of this kind of regression
   decreases as times of buffer reuse increases.

Patch set overview:
- Patch 1/6 and 2/6 mainly about simplifying and optimizing DMA sync
  operation, which will reduce overhead on the data path, especially
  when using virtually contiguous buffers;
- Patch 3/6 and 4/6 introduce a sysctl smcr_buf_type to set the type
  of buffers in new created link group;
- Patch 5/6 allows SMC-R to use virtually contiguous sndbufs and RMBs,
  including buffer creation, destruction, MR operation and access;
- patch 6/6 extends netlink attribute for buffer type of SMC-R link group;

v1->v2:
- Patch 5/6 fixes build issue on 32bit;
- Patch 3/6 adds description of new sysctl in smc-sysctl.rst;
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: Extend SMC-R link group netlink attribute · ddefb2d2

Authored by Wen Gu on Jul 14, 2022

Extend SMC-R link group netlink attribute SMC_GEN_LGR_SMCR.
Introduce SMC_NLA_LGR_R_BUF_TYPE to show the buffer type of
SMC-R link group.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R · b8d19945

Authored by Wen Gu on Jul 14, 2022

On long-running enterprise production servers, high-order contiguous
memory pages are usually very rare and in most cases we can only get
fragmented pages.

When replacing TCP with SMC-R in such production scenarios, attempting
to allocate high-order physically contiguous sndbufs and RMBs may result
in frequent memory compaction, which will cause unexpected hung issue
and further stability risks.

So this patch is aimed to allow SMC-R link group to use virtually
contiguous sndbufs and RMBs to avoid potential issues mentioned above.
Whether to use physically or virtually contiguous buffers can be set
by sysctl smcr_buf_type.

Note that using virtually contiguous buffers will bring an acceptable
performance regression, which can be mainly divided into two parts:

1) regression in data path, which is brought by additional address
   translation of sndbuf by RNIC in Tx. But in general, translating
   address through MTT is fast.

   Taking 256KB sndbuf and RMB as an example, the comparisons in qperf
   latency and bandwidth test with physically and virtually contiguous
   buffers are as follows:

- client:
  smc_run taskset -c <cpu> qperf <server> -oo msg_size:1:64K:*2\
  -t 5 -vu tcp_{bw|lat}
- server:
  smc_run taskset -c <cpu> qperf

   [latency]
   msgsize              tcp            smcr        smcr-use-virt-buf
   1               11.17 us         7.56 us         7.51 us (-0.67%)
   2               10.65 us         7.74 us         7.56 us (-2.31%)
   4               11.11 us         7.52 us         7.59 us ( 0.84%)
   8               10.83 us         7.55 us         7.51 us (-0.48%)
   16              11.21 us         7.46 us         7.51 us ( 0.71%)
   32              10.65 us         7.53 us         7.58 us ( 0.61%)
   64              10.95 us         7.74 us         7.80 us ( 0.76%)
   128             11.14 us         7.83 us         7.87 us ( 0.47%)
   256             10.97 us         7.94 us         7.92 us (-0.28%)
   512             11.23 us         7.94 us         8.20 us ( 3.25%)
   1024            11.60 us         8.12 us         8.20 us ( 0.96%)
   2048            14.04 us         8.30 us         8.51 us ( 2.49%)
   4096            16.88 us         9.13 us         9.07 us (-0.64%)
   8192            22.50 us        10.56 us        11.22 us ( 6.26%)
   16384           28.99 us        12.88 us        13.83 us ( 7.37%)
   32768           40.13 us        16.76 us        16.95 us ( 1.16%)
   65536           68.70 us        24.68 us        24.85 us ( 0.68%)
   [bandwidth]
   msgsize                tcp              smcr          smcr-use-virt-buf
   1                1.65 MB/s         1.59 MB/s         1.53 MB/s (-3.88%)
   2                3.32 MB/s         3.17 MB/s         3.08 MB/s (-2.67%)
   4                6.66 MB/s         6.33 MB/s         6.09 MB/s (-3.85%)
   8               13.67 MB/s        13.45 MB/s        11.97 MB/s (-10.99%)
   16              25.36 MB/s        27.15 MB/s        24.16 MB/s (-11.01%)
   32              48.22 MB/s        54.24 MB/s        49.41 MB/s (-8.89%)
   64             106.79 MB/s       107.32 MB/s        99.05 MB/s (-7.71%)
   128            210.21 MB/s       202.46 MB/s       201.02 MB/s (-0.71%)
   256            400.81 MB/s       416.81 MB/s       393.52 MB/s (-5.59%)
   512            746.49 MB/s       834.12 MB/s       809.99 MB/s (-2.89%)
   1024          1292.33 MB/s      1641.96 MB/s      1571.82 MB/s (-4.27%)
   2048          2007.64 MB/s      2760.44 MB/s      2717.68 MB/s (-1.55%)
   4096          2665.17 MB/s      4157.44 MB/s      4070.76 MB/s (-2.09%)
   8192          3159.72 MB/s      4361.57 MB/s      4270.65 MB/s (-2.08%)
   16384         4186.70 MB/s      4574.13 MB/s      4501.17 MB/s (-1.60%)
   32768         4093.21 MB/s      4487.42 MB/s      4322.43 MB/s (-3.68%)
   65536         4057.14 MB/s      4735.61 MB/s      4555.17 MB/s (-3.81%)

2) regression in buffer initialization and destruction path, which is
   brought by additional MR operations of sndbufs. But thanks to link
   group buffer reuse mechanism, the impact of this kind of regression
   decreases as times of buffer reuse increases.

   Taking 256KB sndbuf and RMB as an example, latency of some key SMC-R
   buffer-related function obtained by bpftrace are as follows:

   Function                         Phys-bufs           Virt-bufs
   smcr_new_buf_create()             67154 ns            79164 ns
   smc_ib_buf_map_sg()                 525 ns              928 ns
   smc_ib_get_memory_region()       162294 ns           161191 ns
   smc_wr_reg_send()                  9957 ns             9635 ns
   smc_ib_put_memory_region()       203548 ns           198374 ns
   smc_ib_buf_unmap_sg()               508 ns             1158 ns
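
The percentages in parentheses in the qperf tables above appear to be the change of the smcr-use-virt-buf column relative to the smcr column (an assumption, since the commit message does not state the formula). A minimal sketch that recomputes a few of them from the rounded table values; `relative_change` is a hypothetical helper name:

```python
def relative_change(baseline, measured):
    """Percentage change of `measured` relative to `baseline`."""
    return (measured - baseline) / baseline * 100.0


# (msgsize, smcr, smcr-use-virt-buf) rows copied from the tables above.
latency_us = [(8192, 10.56, 11.22), (16384, 12.88, 13.83)]
bandwidth_mbs = [(8, 13.45, 11.97), (16, 27.15, 24.16)]

# Positive latency change means slower; negative bandwidth change means less throughput.
latency_change = {size: relative_change(phys, virt)
                  for size, phys, virt in latency_us}
bandwidth_change = {size: relative_change(phys, virt)
                    for size, phys, virt in bandwidth_mbs}
```

The recomputed values agree with the table to within a few hundredths of a percent; the small differences are presumably because the table was derived from unrounded measurements. Note that the 8-byte and 16-byte bandwidth rows come out at roughly 11%, slightly above the "basically less than 10%" summary.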

------------
Test environment notes:
1. Above tests run on 2 VMs within the same Host.
2. The NIC is a ConnectX-4Lx, using SR-IOV and passing through 2 VFs to
   each VM respectively.
3. VMs' vCPUs are bound to different physical CPUs, and the bound
   physical CPUs are isolated by the `isolcpus=xxx` cmdline.
4. The NICs' queue number is set to 1.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: Use sysctl-specified types of buffers in new link group · b984f370

Authored by Wen Gu on Jul 14, 2022

This patch introduces a new SMC-R specific element buf_type
in struct smc_link_group, for recording the value of sysctl
smcr_buf_type when link group is created.

New created link group will create and reuse buffers of the
type specified by buf_type.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: Introduce a sysctl for setting SMC-R buffer type · 4bc5008e

Authored by Wen Gu on Jul 14, 2022

This patch introduces the sysctl smcr_buf_type for setting
the type of SMC-R sndbufs and RMBs.

Valid values include:

- SMCR_PHYS_CONT_BUFS, which means use physically contiguous
  buffers for better performance and is the default value.

- SMCR_VIRT_CONT_BUFS, which means use virtually contiguous
  buffers in case of physically contiguous memory is scarce.

- SMCR_MIXED_BUFS, which means first try to use physically
  contiguous buffers. If not available, then use virtually
  contiguous buffers.
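
The three modes listed above can be sketched as a small decision function. This is an illustrative model only: the enum string values, the function `choose_buffer` and the `phys_available` flag are hypothetical, and the model assumes a virtually contiguous allocation always succeeds.

```python
from enum import Enum


class BufType(Enum):
    # Names mirror the commit message; the string values are illustrative only.
    SMCR_PHYS_CONT_BUFS = "phys"
    SMCR_VIRT_CONT_BUFS = "virt"
    SMCR_MIXED_BUFS = "mixed"


def choose_buffer(buf_type, phys_available):
    """Return "phys" or "virt" for a new buffer, or None if allocation fails.

    `phys_available` is a hypothetical stand-in for "a physically contiguous
    allocation of the requested size succeeded".
    """
    if buf_type is BufType.SMCR_PHYS_CONT_BUFS:
        return "phys" if phys_available else None
    if buf_type is BufType.SMCR_VIRT_CONT_BUFS:
        return "virt"
    if buf_type is BufType.SMCR_MIXED_BUFS:
        # Try physically contiguous first, then fall back.
        return "phys" if phys_available else "virt"
    raise ValueError(f"unknown buffer type: {buf_type!r}")
```

Only the mixed mode changes its answer depending on whether physically contiguous memory is available; the other two modes are fixed choices.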

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu · 0ef69e78

Authored by Guangguan Wang on Jul 14, 2022

Some CPUs, such as Xeon, guarantee DMA cache coherency, so there is
no need to use DMA sync APIs to flush the cache on such CPUs.
In order to avoid calling dma sync APIs on the IO path, use the
dma_need_sync to check whether smc_buf_desc needs dma sync when
creating smc_buf_desc.
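
The pattern described above, deciding once at buffer creation and then checking a cached flag on the IO path, can be sketched as follows. `BufDesc` and `sync_for_device` are hypothetical stand-ins, and a counter replaces the real DMA sync operation.

```python
class BufDesc:
    """Hypothetical stand-in for smc_buf_desc."""

    def __init__(self, device_needs_sync):
        # Evaluated once, at buffer creation, and cached on the descriptor.
        self.need_sync = device_needs_sync
        self.sync_calls = 0


def sync_for_device(buf):
    """IO-path helper: skip the sync entirely when it is not needed."""
    if not buf.need_sync:
        return
    buf.sync_calls += 1  # stands in for the real DMA sync operation


coherent = BufDesc(device_needs_sync=False)
non_coherent = BufDesc(device_needs_sync=True)
for _ in range(3):
    sync_for_device(coherent)
    sync_for_device(non_coherent)
```

On the coherent buffer the IO path costs a single flag test per call, while the non-coherent buffer still gets a sync every time.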

Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: remove redundant dma sync ops · 6d52e2de

Authored by Guangguan Wang on Jul 14, 2022

smc_ib_sync_sg_for_cpu/device are the ops used for DMA memory cache
consistency. SMC sndbufs are DMA buffers that the CPU writes data to
and the PCIe device reads data from. So for sndbufs,
smc_ib_sync_sg_for_device is needed and smc_ib_sync_sg_for_cpu is
redundant, as the PCIe device will not write to the buffers. SMC RMBs
are DMA buffers that the PCIe device writes data to and the CPU reads
data from. So for RMBs, smc_ib_sync_sg_for_cpu is needed and
smc_ib_sync_sg_for_device is redundant as CPU will not write the buffers.
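
The reasoning above boils down to a one-direction rule per buffer kind. A minimal sketch, where the function `required_sync` and the string labels are hypothetical and only name the direction that remains after the cleanup:

```python
def required_sync(buffer_kind):
    """Map a buffer kind to the one DMA sync direction that is still needed."""
    if buffer_kind == "sndbuf":
        # CPU writes, device reads: hand the data over to the device.
        return "sync_sg_for_device"
    if buffer_kind == "rmb":
        # Device writes, CPU reads: make the data visible to the CPU.
        return "sync_sg_for_cpu"
    raise ValueError(f"unknown buffer kind: {buffer_kind!r}")
```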

Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Jul, 2022 · 4 commits

Merge branch 'net-ipv4-ipv6-new-option-to-accept-garp-untracked-na-only-if-in-network' · 2acd1022

Authored by Jakub Kicinski on Jul 15, 2022

Jaehee Park says:

====================
net: ipv4/ipv6: new option to accept garp/untracked na only if in-network

The first patch adds an option to learn a neighbor from garp only if
the source ip is in the same subnet as an address configured on the
interface that received the garp message. The option has been added
to arp_accept in ipv4.

The same feature has been added to ndisc (patch 2). For ipv6, the
subnet filtering knob is an extension of the accept_untracked_na
option introduced in these patches:
https://lore.kernel.org/all/642672cb-8b11-c78f-8975-f287ece9e89e@gmail.com/t/
https://lore.kernel.org/netdev/20220530101414.65439-1-aajith@arista.com/T/

The third patch contains selftests for testing the different options
for accepting arp and neighbor advertisements.
====================

Link: https://lore.kernel.org/r/cover.1657755188.git.jhpark1013@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: net: arp_ndisc_untracked_subnets: test for arp_accept and accept_untracked_na · 0ea7b0a4

Authored by Jaehee Park on Jul 13, 2022

ipv4 arp_accept has a new option '2' to create new neighbor entries
only if the src ip is in the same subnet as an address configured on
the interface that received the garp message. This selftest tests all
options in arp_accept.

ipv6 has a sysctl endpoint, accept_untracked_na, that defines the
behavior for accepting untracked neighbor advertisements. A new option
similar to that of arp_accept for learning only from the same subnet is
added to accept_untracked_na. This selftest tests this new feature.

Signed-off-by: Jaehee Park <jhpark1013@gmail.com>
Suggested-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ipv6: new accept_untracked_na option to accept na only if in-network · aaa5f515

Authored by Jaehee Park on Jul 13, 2022

This patch adds a third knob, '2', which extends the
accept_untracked_na option to learn a neighbor only if the src ip is
in the same subnet as an address configured on the interface that
received the neighbor advertisement. This is similar to the arp_accept
configuration for ipv4.

Signed-off-by: Jaehee Park <jhpark1013@gmail.com>
Suggested-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ipv4: new arp_accept option to accept garp only if in-network · e68c5dcf

Authored by Jaehee Park on Jul 13, 2022

In many deployments, we want the option to not learn a neighbor from
garp if the src ip is not in the same subnet as an address configured
on the interface that received the garp message. net.ipv4.arp_accept
sysctl is currently used to control creation of a neigh from a
received garp packet. This patch adds a new option '2' to
net.ipv4.arp_accept which extends option '1' by including the subnet
check.
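
The subnet check added by option '2' can be illustrated with Python's standard `ipaddress` module. This is a sketch of the decision only, not of the kernel code path: the function `should_learn_from_garp` is hypothetical, and treating option '0' as "never create an entry" is an assumption based on the existing sysctl semantics rather than on this commit message.

```python
import ipaddress


def should_learn_from_garp(arp_accept, src_ip, interface_addrs):
    """Decide whether a gratuitous ARP may create a new neighbour entry.

    arp_accept: 0, 1 or 2, following the option values discussed above.
    interface_addrs: addresses with prefix length configured on the
    receiving interface, e.g. ["192.0.2.1/24"].
    """
    if arp_accept == 0:
        return False
    if arp_accept == 1:
        return True
    if arp_accept == 2:
        src = ipaddress.ip_address(src_ip)
        # Option 2: same as option 1, but only for in-subnet senders.
        return any(src in ipaddress.ip_interface(addr).network
                   for addr in interface_addrs)
    raise ValueError(f"unsupported arp_accept value: {arp_accept}")
```

For example, with `192.0.2.1/24` configured on the interface, a garp from `192.0.2.50` would be learned under option 2, while one from `198.51.100.7` would be ignored.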

Signed-off-by: Jaehee Park <jhpark1013@gmail.com>
Suggested-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

15 Jul, 2022 · 23 commits

octeontx2-af: Set NIX link credits based on max LMAC · 459f326e

Authored by Sunil Goutham on Jul 14, 2022

When three LMACs are active on a CGX/RPM, the current NIX link
credit config is incorrect, because it is based on a per-LMAC FIFO
length that is in turn calculated as
'lmac_fifo_len = total_fifo_len / 3'. In HW
one of the LMAC gets half of the FIFO and rest gets 1/4th.
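
A small worked example of the FIFO split described above. The function `lmac_fifo_len` is a hypothetical model: which LMAC receives the half share is not stated in the commit message, so the sketch assumes it is LMAC 0, and it assumes an even split for any other LMAC count.

```python
def lmac_fifo_len(total_fifo_len, num_lmacs, lmac_id):
    """Model of the per-LMAC FIFO share for a given number of active LMACs."""
    if num_lmacs == 3:
        # Assumption: LMAC 0 is the one that receives half of the FIFO.
        return total_fifo_len // 2 if lmac_id == 0 else total_fifo_len // 4
    # Assumption: every other configuration splits the FIFO evenly.
    return total_fifo_len // num_lmacs


total = 64 * 1024
shares = [lmac_fifo_len(total, 3, lmac_id) for lmac_id in range(3)]
naive_share = total // 3  # the 'total_fifo_len / 3' value the commit calls incorrect
```

For a 64 KB FIFO the hardware split is 32 KB, 16 KB and 16 KB, whereas the naive division gives about 21.3 KB per LMAC, which overestimates the two smaller shares and underestimates the largest one.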

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Geetha Sowjanya <gakula@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: Fixes static warnings · da92e03c

Authored by Ratheesh Kannoth on Jul 14, 2022

Fix static analysis warnings reported by the smatch tool.

rvu_npc_hash.c:1232 rvu_npc_exact_del_table_entry_by_id() error:
uninitialized symbol 'drop_mcam_idx'.

rvu_npc_hash.c:1312 rvu_npc_exact_add_table_entry() error:
uninitialized symbol 'drop_mcam_idx'.

rvu_npc_hash.c:1391 rvu_npc_exact_update_table_entry() error:
uninitialized symbol 'hash_index'.

rvu_npc_hash.c:1428 rvu_npc_exact_promisc_disable() error:
uninitialized symbol 'drop_mcam_idx'.

rvu_npc_hash.c:1473 rvu_npc_exact_promisc_enable() error:
uninitialized symbol 'drop_mcam_idx'.

otx2_dmac_flt.c:191 otx2_dmacflt_update() error: 'rsp'
dereferencing possible ERR_PTR()

otx2_dmac_flt.c:60 otx2_dmacflt_add_pfmac() error: 'rsp'
dereferencing possible ERR_PTR()

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: qca8k: move driver to qca dir · 4bbaf764

Authored by Christian Marangi on Jul 13, 2022

Move qca8k driver to qca dir in preparation for code split and
introduction of ipq4019 switch based on qca8k.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: sch_cbq: Delete unused delay_timer · 88b3822c

Authored by Peilin Ye on Jul 13, 2022

delay_timer has been unused since commit c3498d34 ("cbq: remove
TCA_CBQ_OVL_STRATEGY support").  Delete it.

Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mlx5-updates-2022-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · c8fda7d2

Authored by Jakub Kicinski on Jul 14, 2022

Saeed Mahameed says:

====================
mlx5-updates-2022-07-13

1) Support 802.1ad for bridge offloads

Vlad Buslov Says:
=================

Current mlx5 bridge VLAN offload implementation only supports 802.1Q VLAN
Ethernet protocol. That protocol type is assumed by default and
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL notification is ignored.

In order to support dynamically setting VLAN protocol handle
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL notification by flushing FDB and
re-creating VLAN modify header actions with a new protocol. Implement support
for 802.1ad protocol by saving the current VLAN protocol to per-bridge variable
and re-create the necessary flow groups according to its current value (either
use cvlan or svlan flow fields).
==================
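
The protocol handling described in the quoted text can be modelled in a few lines. This is a sketch, not the mlx5 implementation: the `Bridge` class, its `fdb` dictionary and the function `vlan_match_field` are hypothetical, and flushing only when the protocol actually changes is an assumption. The two EtherType values are the standard ones for 802.1Q and 802.1ad.

```python
ETH_P_8021Q = 0x8100   # 802.1Q customer VLAN tag (C-VLAN)
ETH_P_8021AD = 0x88A8  # 802.1ad service VLAN tag (S-VLAN)


def vlan_match_field(vlan_proto):
    """Pick the flow field to match on for a given bridge VLAN protocol."""
    if vlan_proto == ETH_P_8021Q:
        return "cvlan"
    if vlan_proto == ETH_P_8021AD:
        return "svlan"
    raise ValueError(f"unsupported VLAN protocol: {vlan_proto:#06x}")


class Bridge:
    """Toy per-bridge state: current VLAN protocol plus an FDB."""

    def __init__(self):
        self.vlan_proto = ETH_P_8021Q  # 802.1Q is assumed by default
        self.fdb = {}

    def set_vlan_proto(self, vlan_proto):
        field = vlan_match_field(vlan_proto)  # rejects unsupported protocols
        if vlan_proto != self.vlan_proto:
            # Existing entries were built for the old protocol, so drop them.
            self.fdb.clear()
            self.vlan_proto = vlan_proto
        return field
```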

2) debugfs to count ongoing FW commands

3) debugfs to query eswitch vport firmware diagnostic counters

4) Add missing meter configuration in flow action

5) Some misc cleanup

* tag 'mlx5-updates-2022-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Remove the duplicating check for striding RQ when enabling LRO
  net/mlx5e: Move the LRO-XSK check to mlx5e_fix_features
  net/mlx5e: Extend flower police validation
  net/mlx5e: configure meter in flow action
  net/mlx5e: Removed useless code in function
  net/mlx5: Bridge, implement QinQ support
  net/mlx5: Bridge, implement infrastructure for VLAN protocol change
  net/mlx5: Bridge, extract VLAN push/pop actions creation
  net/mlx5: Bridge, rename filter fg to vlan_filter
  net/mlx5: Bridge, refactor groups sizes and indices
  net/mlx5: debugfs, Add num of in-use FW command interface slots
  net/mlx5: Expose vnic diagnostic counters for eswitch managed vports
  net/mlx5: Use software VHCA id when it's supported
  net/mlx5: Introduce ifc bits for using software vhca id
  net/mlx5: Use the bitmap API to allocate bitmaps
====================

Link: https://lore.kernel.org/r/20220713225859.401241-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-devlink-couple-of-trivial-fixes' · 6e6fbb72

Authored by Jakub Kicinski on Jul 14, 2022

Jiri Pirko says:

====================
net: devlink: couple of trivial fixes

Just a couple of trivial fixes I found on the way.
====================

Link: https://lore.kernel.org/r/20220713141853.2992014-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: devlink: fix return statement in devlink_port_new_notify() · a44c4511

Authored by Jiri Pirko on Jul 13, 2022

Return directly without intermediate value store at the end of
devlink_port_new_notify() function.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: devlink: fix a typo in function name devlink_port_new_notifiy() · ced92571

Authored by Jiri Pirko on Jul 13, 2022

Fix the typo in a name of devlink_port_new_notifiy() function.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: devlink: make devlink_dpipe_headers_register() return void · 9a792366

Authored by Jiri Pirko on Jul 13, 2022

The return value is not used, so change the return value type to void.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 816cd168

Authored by Jakub Kicinski on Jul 14, 2022

include/net/sock.h
  310731e2 ("net: Fix data-races around sysctl_mem.")
  e70f3c70 ("Revert "net: set SK_MEM_QUANTUM to 4096"")
https://lore.kernel.org/all/20220711120211.7c8b7cba@canb.auug.org.au/

net/ipv4/fib_semantics.c
  747c1430 ("ip: fix dflt addr selection for connected nexthop")
  d62607c3 ("net: rename reference+tracking helpers")

net/tls/tls.h
include/net/tls.h
  3d8c51b2 ("net/tls: Check for errors in tls_device_init")
  58790314 ("tls: create an internal header")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

x86/speculation: Use DECLARE_PER_CPU for x86_spec_ctrl_current · db886979

Authored by Nathan Chancellor on Jul 13, 2022

Clang warns:

  arch/x86/kernel/cpu/bugs.c:58:21: error: section attribute is specified on redeclared variable [-Werror,-Wsection]
  DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
                      ^
  arch/x86/include/asm/nospec-branch.h:283:12: note: previous declaration is here
  extern u64 x86_spec_ctrl_current;
             ^
  1 error generated.

The declaration should be using DECLARE_PER_CPU instead so all
attributes stay in sync.

Cc: stable@vger.kernel.org
Fixes: fc02735b ("KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 9bd572ec

Authored by Linus Torvalds on Jul 14, 2022

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, bpf and wireless.

  Still no major regressions, the release continues to be calm. An
  uptick of fixes this time around due to trivial data race fixes and
  patches flowing down from subtrees.

  There has been a few driver fixes (particularly a few fixes for false
  positives due to 66e4c8d9 which went into -next in May!) that make
  me worry the wide testing is not exactly fully through.

  So "calm" but not "let's just cut the final ASAP" vibes over here.

  Current release - regressions:

   - wifi: rtw88: fix write to const table of channel parameters

  Current release - new code bugs:

   - mac80211: add gfp_t arg to ieeee80211_obss_color_collision_notify

   - mlx5:
      - TC, allow offload from uplink to other PF's VF
      - Lag, decouple FDB selection and shared FDB
      - Lag, correct get the port select mode str

   - bnxt_en: fix and simplify XDP transmit path

   - r8152: fix accessing unset transport header

  Previous releases - regressions:

   - conntrack: fix crash due to confirmed bit load reordering (after
     atomic -> refcount conversion)

   - stmmac: dwc-qos: disable split header for Tegra194

  Previous releases - always broken:

   - mlx5e: ring the TX doorbell on DMA errors

   - bpf: make sure mac_header was set before using it

   - mac80211: do not wake queues on a vif that is being stopped

   - mac80211: fix queue selection for mesh/OCB interfaces

   - ip: fix dflt addr selection for connected nexthop

   - seg6: fix skb checksums for SRH encapsulation/insertion

   - xdp: fix spurious packet loss in generic XDP TX path

   - bunch of sysctl data race fixes

   - nf_log: incorrect offset to network header

  Misc:

   - bpf: add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs"

* tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
  nfp: flower: configure tunnel neighbour on cmsg rx
  net/tls: Check for errors in tls_device_init
  MAINTAINERS: Add an additional maintainer to the AMD XGBE driver
  xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue
  selftests/net: test nexthop without gw
  ip: fix dflt addr selection for connected nexthop
  net: atlantic: remove aq_nic_deinit() when resume
  net: atlantic: remove deep parameter on suspend/resume functions
  sfc: fix kernel panic when creating VF
  seg6: bpf: fix skb checksum in bpf_push_seg6_encap()
  seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors
  seg6: fix skb checksum evaluation in SRH encapsulation/insertion
  sfc: fix use after free when disabling sriov
  net: sunhme: output link status with a single print.
  r8152: fix accessing unset transport header
  net: stmmac: fix leaks in probe
  net: ftgmac100: Hold reference returned by of_get_child_by_name()
  nexthop: Fix data-races around nexthop_compat_mode.
  ipv4: Fix data-races around sysctl_ip_dynaddr.
  tcp: Fix a data-race around sysctl_tcp_ecn_fallback.
  ...

Merge tag '5.19-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · f41d5df5

Authored by Linus Torvalds on Jul 14, 2022

Pull cifs fixes from Steve French:
 "Three smb3 client fixes:

   - two multichannel fixes: fix a potential deadlock freeing a channel,
     and fix a race condition on failed creation of a new channel

   - mount failure fix: work around a server bug in some common older
     Samba servers by avoiding padding at the end of the negotiate
     protocol request"

* tag '5.19-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: workaround negprot bug in some Samba servers
  cifs: remove unnecessary locking of chan_lock while freeing session
  cifs: fix race condition with delayed threads

Merge tag 'nfsd-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · a24a6c05

Authored by Linus Torvalds on Jul 14, 2022

Pull nfsd fixes from Chuck Lever:
 "Notable regression fixes:

   - Enable SETATTR(time_create) to fix regression with Mac OS clients

   - Fix a lockd crasher and broken NLM UNLCK behavior"

* tag 'nfsd-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  lockd: fix nlm_close_files
  lockd: set fl_owner when unlocking files
  NFSD: Decode NFSv4 birth time attribute

Merge tag 'integrity-v5.19-fix' of... · 4adfa865

Authored by Linus Torvalds on Jul 14, 2022

Merge tag 'integrity-v5.19-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fixes from Mimi Zohar:
 "Here are a number of fixes for recently found bugs.

  Only 'ima: fix violation measurement list record' was introduced in
  the current release. The rest address existing bugs"

* tag 'integrity-v5.19-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: Fix potential memory leak in ima_init_crypto()
  ima: force signature verification when CONFIG_KEXEC_SIG is configured
  ima: Fix a potential integer overflow in ima_appraise_measurement
  ima: fix violation measurement list record
  Revert "evm: Fix memleak in init_desc"

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 2eb5866c

Authored by Linus Torvalds on Jul 14, 2022

Pull ARM fixes from Russell King:

 - quieten the spectre-bhb prints

 - mark flattened device tree sections as shareable

 - remove some obsolete CPU domain code and help text

 - fix thumb unaligned access abort emulation

 - fix amba_device_add() refcount underflow

 - fix literal placement

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9208/1: entry: add .ltorg directive to keep literals in range
  ARM: 9207/1: amba: fix refcount underflow if amba_device_add() fails
  ARM: 9214/1: alignment: advance IT state after emulating Thumb instruction
  ARM: 9213/1: Print message about disabled Spectre workarounds only once
  ARM: 9212/1: domain: Modify Kconfig help text
  ARM: 9211/1: domain: drop modify_domain()
  ARM: 9210/1: Mark the FDT_FIXED sections as shareable
  ARM: 9209/1: Spectre-BHB: avoid pr_info() every time a CPU comes out of idle

um: Replace to_phys() and to_virt() with less generic function names · 097da1a4

Authored by Guenter Roeck on Jul 14, 2022

The UML function names to_virt() and to_phys() are exposed by UML
headers, and are very generic and may be defined by drivers.  As it
turns out, commit 9409c9b6 ("pmem: refactor pmem_clear_poison()")
did exactly that.

This results in build errors such as the following when trying to build
um:allmodconfig:

  drivers/nvdimm/pmem.c: In function ‘pmem_dax_zero_page_range’:
  ./arch/um/include/asm/page.h:105:20: error: too few arguments to function ‘to_phys’
    105 | #define __pa(virt) to_phys((void *) (unsigned long) (virt))
        |                    ^~~~~~~

Use less generic function names for the um specific to_phys() and
to_virt() functions to fix the problem and to avoid similar problems in
the future.

Fixes: 9409c9b6 ("pmem: refactor pmem_clear_poison()")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'sound-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · c4634a3c

Authored by Linus Torvalds on Jul 14, 2022

Pull sound fixes from Takashi Iwai:
 "Hopefully the last one for 5.19. This became bigger than wished, but
  all changes are pretty device-specific small fixes, which look less
  worrisome.

  The majority of changes are about various ASoC fixes, while the usual
  HD-audio quirks are included as well"

* tag 'sound-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (28 commits)
  ALSA: hda/realtek - Enable the headset-mic on a Xiaomi's laptop
  ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc221
  ALSA: hda/realtek: fix mute/micmute LEDs for HP machines
  ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc671
  ALSA: hda - Add fixup for Dell Latitidue E5430
  ALSA: hda/conexant: Apply quirk for another HP ProDesk 600 G3 model
  ALSA: hda/realtek: Fix headset mic for Acer SF313-51
  ASoC: Intel: Skylake: Correct the handling of fmt_config flexible array
  ASoC: Intel: Skylake: Correct the ssp rate discovery in skl_get_ssp_clks()
  ASoC: rt5640: Fix the wrong state of JD1 and JD2
  ASoC: Intel: sof_rt5682: fix out-of-bounds array access
  ASoC: qdsp6: fix potential memory leak in q6apm_get_audioreach_graph()
  ASoC: tas2764: Fix amp gain register offset & default
  ASoC: tas2764: Correct playback volume range
  ASoC: tas2764: Fix and extend FSYNC polarity handling
  ASoC: tas2764: Add post reset delays
  ASoC: dt-bindings: Fix description for msm8916
  ASoC: doc: Capitalize RESET line name
  ASoC: arizona: Update arizona_aif_cfg_changed to use RX_BCLK_RATE
  ASoC: cs47l92: Fix event generation for OUT1 demux
  ...

nfp: flower: configure tunnel neighbour on cmsg rx · 656bd03a

Authored by Tianyu Yuan on Jul 14, 2022

nfp_tun_write_neigh() function will configure a tunnel neighbour when
calling nfp_tun_neigh_event_handler() or nfp_flower_cmsg_process_one_rx()
(with no tunnel neighbour type) from firmware.

When an IP address is configured on a physical port as a tunnel endpoint,
nothing is done after the cmsg mentioned above is received.

Therefore, add a step to configure the tunnel neighbour in this case.

v2: Correct format of fixes tag.

Fixes: f1df7956 ("nfp: flower: rework tunnel neighbour configuration")
Signed-off-by: Tianyu Yuan <tianyu.yuan@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220714081915.148378-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

656bd03a

net/tls: Check for errors in tls_device_init · 3d8c51b2

Committed by Tariq Toukan on Jul 14, 2022

Add missing error checks in tls_device_init.
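
The pattern this message describes can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel patch: register_subsys(), register_notifier(), device_init() and module_init_sketch() are hypothetical stand-ins, and the failure switch exists only to exercise the error path.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for registration calls; these are not kernel APIs. */
static bool notifier_should_fail;
static bool subsys_registered;

static int register_subsys(void)
{
    subsys_registered = true;
    return 0;
}

static void unregister_subsys(void)
{
    subsys_registered = false;
}

static int register_notifier(void)
{
    return notifier_should_fail ? -ENOMEM : 0;
}

/* Return the registration result instead of discarding it. */
static int device_init(void)
{
    return register_notifier();
}

/* Init path: undo the earlier step when a later step fails. */
static int module_init_sketch(void)
{
    int err;

    err = register_subsys();
    if (err)
        return err;

    err = device_init();
    if (err) {
        unregister_subsys();
        return err;
    }
    return 0;
}
```

The point of the sketch is that an init helper returning int lets the caller both report the failure and unwind whatever was already registered.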

Fixes: e8f69799 ("net/tls: Add generic NIC offload infrastructure")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220714070754.1428-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

3d8c51b2

MAINTAINERS: Add an additional maintainer to the AMD XGBE driver · 51f1c31f

Committed by Tom Lendacky on Jul 13, 2022

Add Shyam Sundar S K as an additional maintainer to support the AMD XGBE
network device driver.

Cc: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/db367f24089c2bbbcd1cec8e21af49922017a110.1657751501.git.thomas.lendacky@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

51f1c31f

xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue · 94e81006

Committed by Juergen Gross on Jul 13, 2022

xenvif_rx_next_skb() expects the rx queue to be non-empty, but when the
loop in xenvif_rx_action() runs multiple iterations, it does not check
that another skb is available in the rx queue.

This can lead to crashes:

[40072.537261] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
[40072.537407] IP: xenvif_rx_skb+0x23/0x590 [xen_netback]
[40072.537534] PGD 0 P4D 0
[40072.537644] Oops: 0000 [#1] SMP NOPTI
[40072.537749] CPU: 0 PID: 12505 Comm: v1-c40247-q2-gu Not tainted 4.12.14-122.121-default #1 SLE12-SP5
[40072.537867] Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 11/23/2021
[40072.537999] task: ffff880433b38100 task.stack: ffffc90043d40000
[40072.538112] RIP: e030:xenvif_rx_skb+0x23/0x590 [xen_netback]
[40072.538217] RSP: e02b:ffffc90043d43de0 EFLAGS: 00010246
[40072.538319] RAX: 0000000000000000 RBX: ffffc90043cd7cd0 RCX: 00000000000000f7
[40072.538430] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffffc90043d43df8
[40072.538531] RBP: 000000000000003f R08: 000077ff80000000 R09: 0000000000000008
[40072.538644] R10: 0000000000007ff0 R11: 00000000000008f6 R12: ffffc90043ce2708
[40072.538745] R13: 0000000000000000 R14: ffffc90043d43ed0 R15: ffff88043ea748c0
[40072.538861] FS: 0000000000000000(0000) GS:ffff880484600000(0000) knlGS:0000000000000000
[40072.538988] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[40072.539088] CR2: 0000000000000080 CR3: 0000000407ac8000 CR4: 0000000000040660
[40072.539211] Call Trace:
[40072.539319] xenvif_rx_action+0x71/0x90 [xen_netback]
[40072.539429] xenvif_kthread_guest_rx+0x14a/0x29c [xen_netback]

Fix that by stopping the loop in case the rx queue becomes empty.
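
A minimal user-space sketch of this kind of fix, assuming a simplified singly linked queue: struct pkt, struct rx_queue, rx_next_pkt() and rx_action() are hypothetical names for illustration and are not the xen-netback code. The batch loop re-checks the queue on every iteration, so the dequeue helper is never entered with an empty queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified packet queue; not the xen-netback structures. */
struct pkt {
    struct pkt *next;
};

struct rx_queue {
    struct pkt *head;
};

/* Requires a non-empty queue, like the function described above. */
static struct pkt *rx_next_pkt(struct rx_queue *q)
{
    struct pkt *p = q->head;

    assert(p != NULL); /* the caller must check for emptiness first */
    q->head = p->next;
    p->next = NULL;
    return p;
}

/*
 * Process up to 'budget' packets. The queue is re-checked before each
 * dequeue, so the loop stops as soon as the queue becomes empty.
 * Returns the number of packets processed.
 */
static int rx_action(struct rx_queue *q, int budget)
{
    int done = 0;

    while (done < budget && q->head != NULL) {
        (void)rx_next_pkt(q);
        done++;
    }
    return done;
}
```

With two queued packets and a budget of eight, the loop processes two packets and exits instead of dereferencing a NULL head on the third iteration.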

Cc: stable@vger.kernel.org
Fixes: 98f6d57c ("xen-netback: process guest rx packets in batches")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Link: https://lore.kernel.org/r/20220713135322.19616-1-jgross@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

94e81006

amdgpu: disable powerpc support for the newer display engine · d11219ad

Committed by Linus Torvalds on Jul 13, 2022

The DRM_AMD_DC_DCN display engine support (Raven, Navi, and newer) has
not been building cleanly on powerpc and causes link errors due to
mixing hard- and soft-float object files:

powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o uses soft float
powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
[..]

and while patches are floating around, it's not exactly obvious what is
going on.

The problem bisects to commit 41b7a347 ("powerpc: Book3S 64-bit
outline-only KASAN support") but that is probably more about changing
config variables than the fundamental cause.

Despite the bisection result, a more directly related commit seems to be
26f4712a ("drm/amd/display: move FPU related code from dcn31 to
dml/dcn31 folder"). It's probably a combination of the two.

This has been going on since the merge window, without any final word.
So instead of blindly applying patches that may or may not be the right
thing, let's disable this for now.

As Michael Ellerman says:
"IIUIC this code was never enabled on ppc before, so disabling it seems
like a reasonable fix to get the build clean"

and once we have more actual feedback (and find any potential users) we
can always re-enable it with the patch that fixes the issues and
back-port as necessary.

Fixes: 41b7a347 ("powerpc: Book3S 64-bit outline-only KASAN support")
Fixes: 26f4712a ("drm/amd/display: move FPU related code from dcn31 to dml/dcn31 folder")
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/20220606153910.GA1773067@roeck-us.net/
Link: https://lore.kernel.org/all/20220618232737.2036722-1-linux@roeck-us.net/
Link: https://lore.kernel.org/all/20220713050724.GA2471738@roeck-us.net/
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d11219ad

14 Jul, 2022 3 commits

selftests/net: test nexthop without gw · cd72e61b

Committed by Nicolas Dichtel on Jul 13, 2022

This test implements the scenario described in the commit
"ip: fix dflt addr selection for connected nexthop".
The test configures a nexthop object with an output device only (no gateway
address) and a route that uses this nexthop. The goal is to check if the
kernel selects a valid source address.

Link: https://lore.kernel.org/netdev/20220712095545.10947-1-nicolas.dichtel@6wind.com/
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20220713114853.29406-2-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

cd72e61b

ip: fix dflt addr selection for connected nexthop · 747c1430

Committed by Nicolas Dichtel on Jul 13, 2022

When a nexthop is added, without a gw address, the default scope was set
to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen
but rejected when the route is used.

When using a route without a nexthop id, the scope can be configured in the
route, thus the problem doesn't exist.

To explain more deeply: when a user creates a nexthop, they cannot specify
the scope. To create it, the function nh_create_ipv4() calls fib_check_nh()
with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() which was
setting scope to 'host'. Then, nh_create_ipv4() calls
fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is
chosen before the route is inserted.

When a 'standard' route (i.e. without a reference to a nexthop) is added,
fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by
the user. iproute2 sets the scope to 'link' by default.
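
The effect of the scope on source address selection can be illustrated with a deliberately simplified model. This is an assumption-laden sketch, not the kernel's selection code: struct addr, select_src() and foo_addrs are hypothetical, and the rule "take the first address whose scope value does not exceed the requested scope" is a simplification. The numeric scope values mirror those in <linux/rtnetlink.h>, where a larger value means a narrower scope.

```c
#include <stddef.h>

/* Values mirror RT_SCOPE_* in <linux/rtnetlink.h>; larger means narrower. */
enum scope { SCOPE_UNIVERSE = 0, SCOPE_LINK = 253, SCOPE_HOST = 254 };

/* Hypothetical address record for this sketch. */
struct addr {
    const char *ip;
    enum scope scope;
};

/* Addresses assumed to exist in namespace 'foo' of the reproducer. */
static const struct addr foo_addrs[] = {
    { "127.0.0.1", SCOPE_HOST },       /* lo */
    { "192.168.0.1", SCOPE_UNIVERSE }, /* eth0 */
};

/*
 * Simplified model: return the first address whose scope value does not
 * exceed the requested scope, or NULL if none qualifies.
 */
static const char *select_src(const struct addr *addrs, size_t n,
                              enum scope max_scope)
{
    for (size_t i = 0; i < n; i++) {
        if (addrs[i].scope <= max_scope)
            return addrs[i].ip;
    }
    return NULL;
}
```

In this model a request with scope 'host' accepts 127.0.0.1, while a request with scope 'link' skips it and returns 192.168.0.1, which matches the two behaviours described in this message.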

Here is a way to reproduce the problem:
ip netns add foo
ip -n foo link set lo up
ip netns add bar
ip -n bar link set lo up
sleep 1

ip -n foo link add name eth0 type dummy
ip -n foo link set eth0 up
ip -n foo address add 192.168.0.1/24 dev eth0

ip -n foo link add name veth0 type veth peer name veth1 netns bar
ip -n foo link set veth0 up
ip -n bar link set veth1 up

ip -n bar address add 192.168.1.1/32 dev veth1
ip -n bar route add default dev veth1

ip -n foo nexthop add id 1 dev veth0
ip -n foo route add 192.168.1.1 nhid 1

Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> RTNETLINK answers: Invalid argument
> $ ip netns exec foo ping -c1 192.168.1.1
> ping: connect: Invalid argument

Try without nexthop group (iproute2 sets scope to 'link' by dflt):
ip -n foo route del 192.168.1.1
ip -n foo route add 192.168.1.1 dev veth0

Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> 192.168.1.1 dev veth0 src 192.168.0.1 uid 0
>     cache
> $ ip netns exec foo ping -c1 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
>
> --- 192.168.1.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms

CC: stable@vger.kernel.org
Fixes: 597cfe4f ("nexthop: Add support for IPv4 nexthops")
Reported-by: Edwin Brossette <edwin.brossette@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20220713114853.29406-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

747c1430

ARM: 9208/1: entry: add .ltorg directive to keep literals in range · 29589ca0

Committed by Ard Biesheuvel on May 31, 2022

LKP reports a build issue on Clang, related to a literal load of
__current issued through the ldr_va macro. This turns out to be due to
the fact that group relocations are disabled when CONFIG_COMPILE_TEST=y,
which means that the ldr_va macro resolves to a pair of LDR
instructions, the first one being a literal load issued too far from its
literal pool.

Due to the introduction of a couple of new uses of this macro in commit
50807460 ("ARM: 9195/1: entry: avoid explicit literal loads"),
the literal pools end up getting rearranged in a way that causes the
literal for __current to go out of range. Let's fix this up by putting a
.ltorg directive in a suitable place in the code.

Link: https://lore.kernel.org/all/202205290805.1vZLAr36-lkp@intel.com/

Fixes: 50807460 ("ARM: 9195/1: entry: avoid explicit literal loads")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

29589ca0

openeuler / Kernel synced successfully more than 1 year ago