Commits · 90cf87d16bd566cff40c2bc8e32e6d4cd3af23f0 · openeuler / Kernel

26 Nov, 2020 · 4 commits

enetc: Let the hardware auto-advance the taprio base-time of 0 · 90cf87d1

Committed by Vladimir Oltean on Nov 25, 2020

The tc-taprio base time indicates the beginning of the tc-taprio
schedule, which is cyclic by definition (where the length of the cycle
in nanoseconds is called the cycle time). The base time is a 64-bit PTP
time in the TAI domain.

Logically, the base-time should be a future time. But that imposes some
restrictions on user space, which has to retrieve the current PTP time
from the NIC first, then calculate a base time that will still be larger
than the current PTP time by the time the kernel driver programs this
value into the hardware. Actually ensuring that the programmed base time
is in the future is still a problem even if the kernel alone deals with this.

Luckily, the enetc hardware already advances a base-time that is in the
past into a congruent time in the immediate future, according to the
same formula that can be found in the software implementation of taprio
(in taprio_get_start_time):

	/* Schedule the start time for the beginning of the next
	 * cycle.
	 */
	n = div64_s64(ktime_sub_ns(now, base), cycle);
	*start = ktime_add_ns(base, (n + 1) * cycle);

There's only one problem: the driver doesn't let the hardware do that.
It interferes with the base-time passed from user space, by special-casing
the situation when the base-time is zero, and replaces that with the
current PTP time. This changes the intended effective base-time of the
schedule, which will in the end have a different phase offset than if
the base-time of 0.000000000 was to be advanced by an integer multiple
of the cycle-time.
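
As an illustration of the formula quoted above, here is a minimal user-space sketch in C. The function name `advance_base_time` and the plain `int64_t` nanosecond values are assumptions made for this example; the kernel code uses `ktime_t` helpers, and the enetc hardware performs the equivalent step on its own.

```c
#include <stdint.h>

/* Hypothetical helper: move a base time that lies in the past to the
 * start of the next cycle, preserving the phase of the schedule.
 * All values are in nanoseconds; cycle must be greater than zero.
 */
static int64_t advance_base_time(int64_t base, int64_t cycle, int64_t now)
{
	int64_t n;

	if (base > now)		/* already in the future: keep it as-is */
		return base;

	n = (now - base) / cycle;	/* whole cycles elapsed since base */
	return base + (n + 1) * cycle;	/* first cycle start after now */
}
```

With a cycle time of 1000 ns and a current time of 2500 ns, a base time of 0 is advanced to 3000 ns, which is still an integer multiple of the cycle time. Replacing the base time with the current time (2500 ns) instead would shift the phase of the schedule by 500 ns.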

Fixes: 34c6adf1 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20201124220259.3027991-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

gro_cells: reduce number of synchronize_net() calls · 2543a600

Committed by Eric Dumazet on Nov 24, 2020

After cited commit, gro_cells_destroy() became damn slow
on hosts with a lot of cores.

This is because we have one additional synchronize_net() per cpu as
stated in the changelog.

gro_cells_init() is setting NAPI_STATE_NO_BUSY_POLL, and this was enough
to not have one synchronize_net() call per netif_napi_del()

We can factorize all the synchronize_net() to a single one,
right before freeing per-cpu memory.

Fixes: 5198d545 ("net: remove napi_hash_del() from driver-facing API")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20201124203822.1360107-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: fix incorrect merge of patch upstream · 12a8fe56

Committed by Antonio Borneo on Nov 24, 2020

Commit 75792624 ("net: stmmac: add flexible PPS to dwmac
4.10a") was intended to modify the struct dwmac410_ops, but it got
somehow badly merged and modified the struct dwmac4_ops.

Revert the modification in struct dwmac4_ops and re-apply it
properly in struct dwmac410_ops.

Fixes: 75792624 ("net: stmmac: add flexible PPS to dwmac 4.10a")
Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Reported-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20201124223729.886992-1-antonio.borneo@st.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: addrlabel: fix possible memory leak in ip6addrlbl_net_init · e255e11e

Committed by Wang Hai on Nov 24, 2020

kmemleak reports a memory leak as follows:

unreferenced object 0xffff8880059c6a00 (size 64):
  comm "ip", pid 23696, jiffies 4296590183 (age 1755.384s)
  hex dump (first 32 bytes):
    20 01 00 10 00 00 00 00 00 00 00 00 00 00 00 00   ...............
    1c 00 00 00 00 00 00 00 00 00 00 00 07 00 00 00  ................
  backtrace:
    [<00000000aa4e7a87>] ip6addrlbl_add+0x90/0xbb0
    [<0000000070b8d7f1>] ip6addrlbl_net_init+0x109/0x170
    [<000000006a9ca9d4>] ops_init+0xa8/0x3c0
    [<000000002da57bf2>] setup_net+0x2de/0x7e0
    [<000000004e52d573>] copy_net_ns+0x27d/0x530
    [<00000000b07ae2b4>] create_new_namespaces+0x382/0xa30
    [<000000003b76d36f>] unshare_nsproxy_namespaces+0xa1/0x1d0
    [<0000000030653721>] ksys_unshare+0x3a4/0x780
    [<0000000007e82e40>] __x64_sys_unshare+0x2d/0x40
    [<0000000031a10c08>] do_syscall_64+0x33/0x40
    [<0000000099df30e7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

We should free all rules when we catch an error in ip6addrlbl_net_init(),
otherwise a memory leak will occur.

Fixes: 2a8cc6c8 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Link: https://lore.kernel.org/r/20201124071728.8385-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

25 Nov, 2020 · 16 commits

Documentation: netdev-FAQ: suggest how to post co-dependent series · 6f7a1f9c

Committed by Jakub Kicinski on Nov 24, 2020

Make an explicit suggestion how to post user space side of kernel
patches to avoid reposts when patchwork groups the wrong patches.

v2: mention the cases unlike iproute2 explicitly

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'batadv-net-pullrequest-20201124' of git://git.open-mesh.org/linux-merge · 26c89965

Committed by Jakub Kicinski on Nov 24, 2020

Simon Wunderlich says:

====================
Here is a batman-adv bugfix:

 - set module owner to THIS_MODULE, by Taehee Yoo

* tag 'batadv-net-pullrequest-20201124' of git://git.open-mesh.org/linux-merge:
  batman-adv: set .owner to THIS_MODULE
====================

Link: https://lore.kernel.org/r/20201124134417.17269-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'ibmvnic-null-pointer-dereference' · 49d66ed8

Committed by Jakub Kicinski on Nov 24, 2020

Lijun Pan says:

====================
ibmvnic: null pointer dereference

Fix two NULL pointer dereference crash issues.
Improve module removal procedure.
====================

Link: https://lore.kernel.org/r/20201123193547.57225-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ibmvnic: enhance resetting status check during module exit · 3ada2881

Committed by Lijun Pan on Nov 23, 2020

Based on the discussion with Sukadev Bhattiprolu and Dany Madden,
we believe that checking adapter->resetting bit is preferred
since RESETTING state flag is not as strict as resetting bit.
RESETTING state flag is removed since it is verbose now.

Fixes: 7d7195a0 ("ibmvnic: Do not process device remove during device reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ibmvnic: fix NULL pointer dereference in ibmvnic_reset_crq · 0e435bef

Committed by Lijun Pan on Nov 23, 2020

crq->msgs could be NULL if the previous reset did not complete after
freeing crq->msgs. Check for NULL before dereferencing them.

Snippet of call trace:
...
ibmvnic 30000003 env3 (unregistering): Releasing sub-CRQ
ibmvnic 30000003 env3 (unregistering): Releasing CRQ
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0000000000c1a30
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: ibmvnic(E-) rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag tun raw_diag inet_diag unix_diag bridge af_packet_diag netlink_diag stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ibmvnic]
CPU: 20 PID: 8426 Comm: kworker/20:0 Tainted: G            E     5.10.0-rc1+ #12
Workqueue: events __ibmvnic_reset [ibmvnic]
NIP:  c0000000000c1a30 LR: c008000001b00c18 CTR: 0000000000000400
REGS: c00000000d05b7a0 TRAP: 0380   Tainted: G            E      (5.10.0-rc1+)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002480  XER: 20040000
CFAR: c0000000000c19ec IRQMASK: 0
GPR00: 0000000000000400 c00000000d05ba30 c008000001b17c00 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000000000000 00000000000001e2
GPR08: 000000000001f400 ffffffffffffd950 0000000000000000 c008000001b0b280
GPR12: c0000000000c19c8 c00000001ec72e00 c00000000019a778 c00000002647b440
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000006 0000000000000001 0000000000000003 0000000000000002
GPR24: 0000000000001000 c008000001b0d570 0000000000000005 c00000007ab5d550
GPR28: c00000007ab5c000 c000000032fcf848 c00000007ab5cc00 c000000032fcf800
NIP [c0000000000c1a30] memset+0x68/0x104
LR [c008000001b00c18] ibmvnic_reset_crq+0x70/0x110 [ibmvnic]
Call Trace:
[c00000000d05ba30] [0000000000000800] 0x800 (unreliable)
[c00000000d05bab0] [c008000001b0a930] do_reset.isra.40+0x224/0x634 [ibmvnic]
[c00000000d05bb80] [c008000001b08574] __ibmvnic_reset+0x17c/0x3c0 [ibmvnic]
[c00000000d05bc50] [c00000000018d9ac] process_one_work+0x2cc/0x800
[c00000000d05bd20] [c00000000018df58] worker_thread+0x78/0x520
[c00000000d05bdb0] [c00000000019a934] kthread+0x1c4/0x1d0
[c00000000d05be20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c

Fixes: 032c5e82 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues · a0faaa27

Committed by Lijun Pan on Nov 23, 2020

adapter->tx_scrq and adapter->rx_scrq could be NULL if the previous reset
did not complete after freeing sub crqs. Check for NULL before
dereferencing them.

Snippet of call trace:
ibmvnic 30000006 env6: Releasing sub-CRQ
ibmvnic 30000006 env6: Releasing CRQ
...
ibmvnic 30000006 env6: Got Control IP offload Response
ibmvnic 30000006 env6: Re-setting tx_scrq[0]
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc008000003dea7cc
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag tun bridge stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmvnic ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
CPU: 80 PID: 1856 Comm: kworker/80:2 Tainted: G        W         5.8.0+ #4
Workqueue: events __ibmvnic_reset [ibmvnic]
NIP:  c008000003dea7cc LR: c008000003dea7bc CTR: 0000000000000000
REGS: c0000007ef7db860 TRAP: 0380   Tainted: G        W          (5.8.0+)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28002422  XER: 0000000d
CFAR: c000000000bd9520 IRQMASK: 0
GPR00: c008000003dea7bc c0000007ef7dbaf0 c008000003df7400 c0000007fa26ec00
GPR04: c0000007fcd0d008 c0000007fcd96350 0000000000000027 c0000007fcd0d010
GPR08: 0000000000000023 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000002000 c00000001ec18e00 c0000000001982f8 c0000007bad6e840
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 fffffffffffffef7
GPR24: 0000000000000402 c0000007fa26f3a8 0000000000000003 c00000016f8ec048
GPR28: 0000000000000000 0000000000000000 0000000000000000 c0000007fa26ec00
NIP [c008000003dea7cc] ibmvnic_reset_init+0x15c/0x258 [ibmvnic]
LR [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic]
Call Trace:
[c0000007ef7dbaf0] [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] (unreliable)
[c0000007ef7dbb80] [c008000003de8860] __ibmvnic_reset+0x408/0x970 [ibmvnic]
[c0000007ef7dbc50] [c00000000018b7cc] process_one_work+0x2cc/0x800
[c0000007ef7dbd20] [c00000000018bd78] worker_thread+0x78/0x520
[c0000007ef7dbdb0] [c0000000001984c4] kthread+0x1d4/0x1e0
[c0000007ef7dbe20] [c00000000000cea8] ret_from_kernel_thread+0x5c/0x74

Fixes: 57a49436 ("ibmvnic: Reset sub-crqs during driver reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'fixes-for-ena-driver' · 5fc145f1

Committed by Jakub Kicinski on Nov 24, 2020

Shay Agroskin says:

====================
Fixes for ENA driver

- fix wrong data offset on machines that support rx offset
- work-around Intel iommu issue
- fix out of bound access when request id is wrong
====================

Link: https://lore.kernel.org/r/20201123190859.21298-1-shayagr@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ena: fix packet's addresses for rx_offset feature · 1396d314

Committed by Shay Agroskin on Nov 23, 2020

This patch fixes two lines in which the rx_offset received by the device
wasn't taken into account:

- prefetch function:
	In our driver the copied data would reside in
	rx_info->page + rx_headroom + rx_offset

	so the prefetch function is changed accordingly.

- setting page_offset to zero for descriptors > 1:
	for every descriptor but the first, the rx_offset is zero. Hence
	the page_offset value should be set to rx_headroom.

	The previous implementation changed the value of rx_info after
	the descriptor was added to the SKB (essentially providing wrong
	page offset).

Fixes: 68f236df ("net: ena: add support for the rx offset feature")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ena: set initial DMA width to avoid intel iommu issue · 09323b3b

Committed by Shay Agroskin on Nov 23, 2020

The ENA driver uses the readless mechanism, which uses DMA, to find
out what the DMA mask is supposed to be.

If DMA is used without setting the dma_mask first, it causes the
Intel IOMMU driver to think that ENA is a 32-bit device and therefore
disables IOMMU passthrough permanently.

This patch sets the dma_mask to be ENA_MAX_PHYS_ADDR_SIZE_BITS=48
before readless initialization in
ena_device_init()->ena_com_mmio_reg_read_request_init(),
which is large enough to workaround the intel_iommu issue.

DMA mask is set again to the correct value after it's received from the
device after readless is initialized.

The patch also changes the driver to use dma_set_mask_and_coherent()
function instead of the two pci_set_dma_mask() and
pci_set_consistent_dma_mask() ones. Both methods achieve the same
effect.

Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Mike Cui <mikecui@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ena: handle bad request id in ena_netdev · 5b7022cf

Committed by Shay Agroskin on Nov 23, 2020

After request id is checked in validate_rx_req_id() its value is still
used in the line
	rx_ring->free_ids[next_to_clean] =
					rx_ring->ena_bufs[i].req_id;
even if it was found to be out-of-bound for the array free_ids.

The patch moves the request id validation to an earlier stage in the napi
routine and makes sure its value isn't used if it's found to be out-of-bounds.
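
The ordering described above (validate first, and only then use the value) can be sketched in plain C as follows. The ring layout, `RING_SIZE`, and the function name `recycle_req_id` are hypothetical and far simpler than the real ENA driver structures.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8	/* hypothetical ring size */

/* Simplified stand-in for the driver's RX ring bookkeeping. */
struct rx_ring_sketch {
	uint16_t free_ids[RING_SIZE];
	size_t next_to_clean;
};

/* Returns 0 on success, or -1 if req_id is out of bounds, in which
 * case the ring is left untouched.
 */
static int recycle_req_id(struct rx_ring_sketch *ring, uint16_t req_id)
{
	if (req_id >= RING_SIZE)	/* reject before any use of req_id */
		return -1;

	ring->free_ids[ring->next_to_clean] = req_id;
	ring->next_to_clean = (ring->next_to_clean + 1) % RING_SIZE;
	return 0;
}
```

The point of the sketch is only the order of operations: a request id that fails the bounds check never reaches the assignment.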

Fixes: 30623e1e ("net: ena: avoid memory access violation by validating req_id properly")
Signed-off-by: Ido Segev <idose@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: s3fwrn5: use signed integer for parsing GPIO numbers · d8f0a867

Committed by Krzysztof Kozlowski on Nov 23, 2020

GPIOs - as returned by of_get_named_gpio() and used by the gpiolib - are
signed integers, where negative number indicates error. The return
value of of_get_named_gpio() should not be assigned to an unsigned int
because in case of !CONFIG_GPIOLIB such number would be a valid GPIO.

Fixes: c04c674f ("nfc: s3fwrn5: Add driver for Samsung S3FWRN5 NFC Chip")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201123162351.209100-1-krzk@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dpaa2-eth: Fix compile error due to missing devlink support · 078eb55c

Committed by Ezequiel Garcia on Nov 23, 2020

The dpaa2 driver depends on devlink, so it should select
NET_DEVLINK in order to fix compile errors, such as:

drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.o: in function `dpaa2_eth_rx_err':
dpaa2-eth.c:(.text+0x3cec): undefined reference to `devlink_trap_report'
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-devlink.o: in function `dpaa2_eth_dl_info_get':
dpaa2-eth-devlink.c:(.text+0x160): undefined reference to `devlink_info_driver_name_put'

Fixes: ceeb03ad ("dpaa2-eth: add basic devlink support")
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20201123163553.1666476-1-ciorneiioana@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: Update page pool entry · bc40a369

Committed by Jesper Dangaard Brouer on Nov 23, 2020

Add some file F: matches that are related to page_pool.

Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/r/160613894639.2826716.14635284017814375894.stgit@firesoul
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp: Set ECT0 bit in tos/tclass for synack when BPF needs ECN · 407c85c7

Committed by Alexander Duyck on Nov 20, 2020

When a BPF program is used to select between a type of TCP congestion
control algorithm that uses either ECN or not there is a case where the
synack for the frame was coming up without the ECT0 bit set. A bit of
research found that this was due to the final socket being configured to
dctcp while the listener socket was staying in cubic.

To reproduce it all that is needed is to monitor TCP traffic while running
the sample bpf program "samples/bpf/tcp_cong_kern.c". What is observed,
assuming tcp_dctcp module is loaded or compiled in and the traffic matches
the rules in the sample file, is that for all frames with the exception of
the synack the ECT0 bit is set.

To address that it is necessary to make one additional call to
tcp_bpf_ca_needs_ecn using the request socket and then use the output of
that to set the ECT0 bit for the tos/tclass of the packet.

Fixes: 91b5b21c ("bpf: Add support for changing congestion control")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/160593039663.2604.1374502006916871573.stgit@localhost.localdomain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: Fix reload stats structure · 5204bb68

Committed by Moshe Shemesh on Nov 23, 2020

Fix reload stats structure exposed to the user. Change stats structure
hierarchy to have the reload action as a parent of the stat entry and
then stat entry includes value per limit. This will also help to avoid
string concatenation on iproute2 output.

Reload stats structure before this fix:
"stats": {
    "reload": {
        "driver_reinit": 2,
        "fw_activate": 1,
        "fw_activate_no_reset": 0
     }
}

After this fix:
"stats": {
    "reload": {
        "driver_reinit": {
            "unspecified": 2
        },
        "fw_activate": {
            "unspecified": 1,
            "no_reset": 0
        }
    }
}

Fixes: a254c264 ("devlink: Add reload stats")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/1606109785-25197-1-git-send-email-moshe@mellanox.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

aquantia: Remove the build_skb path · 9bd2702d

Committed by Lincoln Ramsay on Nov 23, 2020

When performing IPv6 forwarding, there is an expectation that SKBs
will have some headroom. When forwarding a packet from the aquantia
driver, this does not always happen, triggering a kernel warning.

aq_ring.c has this code (edited slightly for brevity):

if (buff->is_eop && buff->len <= AQ_CFG_RX_FRAME_MAX - AQ_SKB_ALIGN) {
    skb = build_skb(aq_buf_vaddr(&buff->rxdata), AQ_CFG_RX_FRAME_MAX);
} else {
    skb = napi_alloc_skb(napi, AQ_CFG_RX_HDR_SIZE);

There is a significant difference between the SKB produced by these
2 code paths. When napi_alloc_skb creates an SKB, there is a certain
amount of headroom reserved. However, this is not done in the
build_skb codepath.

As the hardware buffer that build_skb is built around does not
handle the presence of the SKB header, this code path is being
removed and the napi_alloc_skb path will always be used. This code
path does have to copy the packet header into the SKB, but it adds
the packet data as a frag.

Fixes: 018423e9 ("net: ethernet: aquantia: Add ring support code")
Signed-off-by: Lincoln Ramsay <lincoln.ramsay@opengear.com>
Link: https://lore.kernel.org/r/MWHPR1001MB23184F3EAFA413E0D1910EC9E8FC0@MWHPR1001MB2318.namprd10.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

24 Nov, 2020 · 5 commits

net/packet: fix packet receive on L3 devices without visible hard header · d5496990

Committed by Eyal Birger on Nov 21, 2020

In the patchset merged by commit b9fcf0a0
("Merge branch 'support-AF_PACKET-for-layer-3-devices'") L3 devices which
did not have header_ops were given one for the purpose of protocol parsing
on af_packet transmit path.

That change made af_packet receive path regard these devices as having a
visible L3 header and therefore aligned incoming skb->data to point to the
skb's mac_header. Some devices, such as ipip, xfrmi, and others, do not
reset their mac_header prior to ingress and therefore their incoming
packets became malformed.

Ideally these devices would reset their mac headers, or af_packet would be
able to rely on dev->hard_header_len being 0 for such cases, but it seems
this is not the case.

Fix by changing af_packet RX ll visibility criteria to include the
existence of a '.create()' header operation, which is used when creating
a device hard header - via dev_hard_header() - by upper layers, and does
not exist in these L3 devices.

As this predicate may be useful in other situations, add it as a common
dev_has_header() helper in netdevice.h.
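
A minimal sketch of such a predicate is shown below, using simplified stand-in types rather than the kernel's `struct net_device` and `struct header_ops`. The `_sketch` names, the reduced `create` signature, and `dummy_create` are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for header_ops: only the member relevant here is kept,
 * and its signature is simplified.
 */
struct header_ops_sketch {
	int (*create)(void);
};

/* Stand-in for net_device. */
struct net_device_sketch {
	const struct header_ops_sketch *header_ops;
};

/* Placeholder for a real hard-header construction routine. */
static int dummy_create(void)
{
	return 0;
}

/* A device is treated as having a visible hard header only if it has
 * header_ops AND those ops provide a create() operation.
 */
static bool dev_has_header_sketch(const struct net_device_sketch *dev)
{
	return dev->header_ops && dev->header_ops->create;
}
```

Under this predicate, a device whose header_ops exist only for protocol parsing (no `create`) is treated the same as a device with no header_ops at all.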

Fixes: b9fcf0a0 ("Merge branch 'support-AF_PACKET-for-layer-3-devices'")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20201121062817.3178900-1-eyal.birger@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

i40e: Fix removing driver while bare-metal VFs pass traffic · 2980cbd4

Committed by Sylwester Dziedziuch on Nov 20, 2020

Prevent VFs from resetting when PF driver is being unloaded:
- introduce new pf state: __I40E_VF_RESETS_DISABLED;
- check if pf state has __I40E_VF_RESETS_DISABLED state set,
  if so, disable any further VFLR event notifications;
- when i40e_remove (rmmod i40e) is called, disable any resets on
  the VFs;

Previously if there were bare-metal VFs passing traffic and PF
driver was removed, there was a possibility of VFs triggering a Tx
timeout right before iavf_remove. This was causing iavf_close to
not be called because there is a check at the beginning of iavf_remove
that bails out early if adapter->state < IAVF_DOWN_PENDING. This
makes it so some resources do not get cleaned up.

Fixes: 6a9ddb36 ("i40e: disable IOV before freeing resources")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20201120180640.3654474-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vsock/virtio: discard packets only when socket is really closed · 3fe356d5

Committed by Stefano Garzarella on Nov 20, 2020

Starting from commit 8692cefc ("virtio_vsock: Fix race condition
in virtio_transport_recv_pkt"), we discard packets in
virtio_transport_recv_pkt() if the socket has been released.

When the socket is connected, we schedule a delayed work to wait for the
RST packet from the other peer, even if SHUTDOWN_MASK is set in
sk->sk_shutdown.
This is done to complete the virtio-vsock shutdown algorithm, releasing
the port assigned to the socket definitively only when the other peer
has consumed all the packets.

If we discard the RST packet received, the socket will be closed only
when the VSOCK_CLOSE_TIMEOUT is reached.

Sergio discovered the issue while running ab(1) HTTP benchmark using
libkrun [1] and observing a latency increase with that commit.

To avoid this issue, we discard packets only if the socket is really
closed (SOCK_DONE flag is set).
We also set SOCK_DONE in virtio_transport_release() when we don't need
to wait for any packets from the other peer (we didn't schedule the delayed
work). In this case we remove the socket from the vsock lists, releasing
the port assigned.

[1] https://github.com/containers/libkrun

Fixes: 8692cefc ("virtio_vsock: Fix race condition in virtio_transport_recv_pkt")
Cc: justin.he@arm.com
Reported-by: Sergio Lopez <slp@redhat.com>
Tested-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jia He <justin.he@arm.com>
Link: https://lore.kernel.org/r/20201120104736.73749-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp: fix race condition when creating child sockets from syncookies · 01770a16

Committed by Ricardo Dias on Nov 20, 2020

When the TCP stack is in SYN flood mode, the server child socket is
created from the SYN cookie received in a TCP packet with the ACK flag
set.

The child socket is created when the server receives the first TCP
packet with a valid SYN cookie from the client. Usually, this packet
corresponds to the final step of the TCP 3-way handshake, the ACK
packet. But it is also possible to receive a valid SYN cookie from the
first TCP data packet sent by the client, and thus create a child socket
from that SYN cookie.

Since a client socket is ready to send data as soon as it receives the
SYN+ACK packet from the server, the client can send the ACK packet (sent
by the TCP stack code), and the first data packet (sent by the userspace
program) almost at the same time, and thus the server will equally
receive the two TCP packets with valid SYN cookies almost at the same
instant.

When such an event happens, the TCP stack code has a race condition that
occurs between the moment a lookup is done in the established
connections hashtable to check for the existence of a connection for the
same client, and the moment that the child socket is added to the
established connections hashtable. As a consequence, this race condition
can lead to a situation where we add two child sockets to the
established connections hashtable and deliver two sockets for the same
client to the userspace program.

This patch fixes the race condition by checking if an existing child
socket exists for the same client when we are adding the second child
socket to the established connections hashtable. If an existing child
socket exists, we drop the packet and discard the second child socket
for the same client.
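
The idea of checking for an existing entry at insertion time, rather than relying on a separate earlier lookup, can be sketched with a toy table in C. Everything here (the `conn_key` fields, the fixed-size table, the result codes) is hypothetical; this single-threaded sketch does not model the locking that the kernel needs around its established connections hashtable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS 16	/* hypothetical capacity */

/* Toy connection identity: one client address and port. */
struct conn_key {
	uint32_t addr;
	uint16_t port;
};

struct conn_table {
	struct conn_key keys[TABLE_SLOTS];
	bool used[TABLE_SLOTS];
};

enum insert_result { INSERTED = 0, DUPLICATE = 1, TABLE_FULL = 2 };

/* Insert key only if no equal key is already present. The duplicate
 * check and the insertion form a single step, so there is no window
 * between "lookup" and "add" for a second entry to slip in.
 */
static enum insert_result insert_unique(struct conn_table *t,
					struct conn_key key)
{
	size_t free_slot = TABLE_SLOTS;
	size_t i;

	for (i = 0; i < TABLE_SLOTS; i++) {
		if (!t->used[i]) {
			if (free_slot == TABLE_SLOTS)
				free_slot = i;	/* remember first free slot */
			continue;
		}
		if (t->keys[i].addr == key.addr && t->keys[i].port == key.port)
			return DUPLICATE;	/* caller discards its new entry */
	}

	if (free_slot == TABLE_SLOTS)
		return TABLE_FULL;

	t->keys[free_slot] = key;
	t->used[free_slot] = true;
	return INSERTED;
}
```

A caller that gets `DUPLICATE` back would drop the packet and free the second child object, mirroring the behaviour the commit message describes.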

Signed-off-by: Ricardo Dias <rdias@singlestore.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20201120111133.GA67501@rdias-suse-pc.lan
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'wireless-drivers-2020-11-23' of... · 1eae77bf

Committed by Jakub Kicinski on Nov 23, 2020

Merge tag 'wireless-drivers-2020-11-23' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for v5.10

First set of fixes for v5.10. One fix for iwlwifi kernel panic, others
less notable.

rtw88

* fix a bogus test found by clang

iwlwifi

* fix long memory reads causing soft lockup warnings

* fix kernel panic during Channel Switch Announcement (CSA)

* other smaller fixes

MAINTAINERS

* email address updates

* tag 'wireless-drivers-2020-11-23' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers:
  iwlwifi: mvm: fix kernel panic in case of assert during CSA
  iwlwifi: pcie: set LTR to avoid completion timeout
  iwlwifi: mvm: write queue_sync_state only for sync
  iwlwifi: mvm: properly cancel a session protection for P2P
  iwlwifi: mvm: use the HOT_SPOT_CMD to cancel an AUX ROC
  iwlwifi: sta: set max HE max A-MPDU according to HE capa
  MAINTAINERS: update maintainers list for Cypress
  MAINTAINERS: update Yan-Hsuan's email address
  iwlwifi: pcie: limit memory read spin time
  rtw88: fix fw_fifo_addr check
====================

Link: https://lore.kernel.org/r/20201123161037.C11D1C43460@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

22 Nov, 2020 · 9 commits

Merge branch 'ibmvnic-fixes-in-reset-path' · f9b03653

Committed by Jakub Kicinski on Nov 21, 2020

Lijun Pan says:

====================
ibmvnic: fixes in reset path

Patch 1/3 and 2/3 notify peers in failover and migration reset.
Patch 3/3 skips timeout reset if it is already resetting.
====================

Link: https://lore.kernel.org/r/20201120224013.46891-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ibmvnic: skip tx timeout reset while in resetting · 855a631a

Committed by Lijun Pan on Nov 20, 2020

Sometimes it takes longer than 5 seconds (watchdog timeout) to complete
failover, migration, and other resets. Instead of scheduling another
timeout reset, we wait for the current one to complete.

Suggested-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ibmvnic: notify peers when failover and migration happen · 98025bce

Committed by Lijun Pan on Nov 20, 2020

Commit 61d3e1d9 ("ibmvnic: Remove netdev notify for failover resets")
excluded the failover case for notify call because it said
netdev_notify_peers() can cause network traffic to stall or halt.
Current testing does not show network traffic stall
or halt because of the notify call for failover event.
netdev_notify_peers may be used when a device wants to inform the
rest of the network about some sort of a reconfiguration
such as failover or migration.

It is unnecessary to call that in other events like
FATAL, NON_FATAL, CHANGE_PARAM, and TIMEOUT resets
since in those scenarios the hardware does not change.
If the driver must do a hard reset, it is necessary to notify peers.

Fixes: 61d3e1d9 ("ibmvnic: Remove netdev notify for failover resets")
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
Suggested-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ibmvnic: fix call_netdevice_notifiers in do_reset · 83935975

Committed by Lijun Pan on Nov 20, 2020

When netdev_notify_peers was substituted in
commit 986103e7 ("net/ibmvnic: Fix RTNL deadlock during device reset"),
call_netdevice_notifiers(NETDEV_RESEND_IGMP, dev) was missed.
Fix it now.

Fixes: 986103e7 ("net/ibmvnic: Fix RTNL deadlock during device reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tun: honor IOCB_NOWAIT flag · 5aac0390

Committed by Jens Axboe on Nov 20, 2020

tun only checks the file O_NONBLOCK flag, but it should also be checking
the iocb IOCB_NOWAIT flag. Any fops using ->read/write_iter() should check
both, otherwise it breaks users that correctly expect O_NONBLOCK semantics
if IOCB_NOWAIT is set.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/e9451860-96cc-c7c7-47b8-fe42cadd5f4c@kernel.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
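
The check described in this commit message can be illustrated with a
userspace sketch. It is an assumption-laden illustration, not the tun
driver's code: SKETCH_IOCB_NOWAIT and its value are placeholders for
the kernel-internal IOCB_NOWAIT flag, and the function name is
hypothetical.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Placeholder for the kernel's IOCB_NOWAIT; the value is an assumption. */
#define SKETCH_IOCB_NOWAIT 0x8

/*
 * A read/write_iter handler should avoid blocking if EITHER the file
 * was opened with O_NONBLOCK OR this particular request carries the
 * per-I/O "no wait" flag.
 */
static bool sketch_should_not_block(unsigned int file_flags, int iocb_flags)
{
	return (file_flags & O_NONBLOCK) || (iocb_flags & SKETCH_IOCB_NOWAIT);
}
```

Checking only the first operand reproduces the behavior the commit
describes as broken: a request flagged as no-wait on a blocking file
descriptor would still block.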

5aac0390

net/af_iucv: set correct sk_protocol for child sockets · c5dab094

Committed by Julian Wiedmann on Nov 20, 2020

Child sockets erroneously inherit their parent's sk_type (ie. SOCK_*),
instead of the PF_IUCV protocol that the parent was created with in
iucv_sock_create().

We're currently not using sk->sk_protocol ourselves, so this shouldn't
have much impact (except eg. getting the output in skb_dump() right).

Fixes: eac3731b ("[S390]: Add AF_IUCV socket support")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Link: https://lore.kernel.org/r/20201120100657.34407-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c5dab094

usbnet: ipheth: fix connectivity with iOS 14 · f33d9e2b

Committed by Yves-Alexis Perez on Nov 19, 2020

Starting with iOS 14 released in September 2020, connectivity using the
personal hotspot USB tethering function of iOS devices is broken.

Communication between the host and the device (for example ICMP traffic
or DNS resolution using the DNS service running in the device itself)
works fine, but communication to endpoints further away doesn't work.

Investigation on the matter shows that no UDP and ICMP traffic from the
tethered host is reaching the Internet at all. For TCP traffic there are
exchanges between tethered host and server but packets are modified in
transit leading to impossible communication.

After some trials Matti Vuorela discovered that reducing the URB buffer
size by two bytes restored the previous behavior. While a better
solution might exist to fix the issue, since the protocol is not
publicly documented and considering the small size of the fix, let's do
that.

Tested-by: Matti Vuorela <matti.vuorela@bitfactor.fi>
Signed-off-by: Yves-Alexis Perez <corsac@corsac.net>
Link: https://lore.kernel.org/linux-usb/CAAn0qaXmysJ9vx3ZEMkViv_B19ju-_ExN8Yn_uSefxpjS6g4Lw@mail.gmail.com/
Link: https://github.com/libimobiledevice/libimobiledevice/issues/1038
Link: https://lore.kernel.org/r/20201119172439.94988-1-corsac@corsac.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

f33d9e2b

cxgb4: Fix build failure when CONFIG_TLS=m · 659fbdcf

Committed by Tom Seewald on Nov 20, 2020

After commit 9d2e5e9e ("cxgb4/ch_ktls: decrypted bit is not enough")
whenever CONFIG_TLS=m and CONFIG_CHELSIO_T4=y, the following build
failure occurs:

ld: drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.o: in function
`cxgb_select_queue':
cxgb4_main.c:(.text+0x2dac): undefined reference to `tls_validate_xmit_skb'

Fix this by ensuring that if TLS is set to be a module, CHELSIO_T4 will
also be compiled as a module. Otherwise, the cxgb4 driver will not be
able to access TLS' symbols.

Fixes: 9d2e5e9e ("cxgb4/ch_ktls: decrypted bit is not enough")
Signed-off-by: Tom Seewald <tseewald@gmail.com>
Link: https://lore.kernel.org/r/20201120192528.615-1-tseewald@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

659fbdcf

bonding: wait for sysfs kobject destruction before freeing struct slave · b9ad3e9f

Committed by Jamie Iles on Nov 20, 2020

syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a
struct slave device could result in the following splat:

  kobject: 'bonding_slave' (00000000cecdd4fe): kobject_release, parent 0000000074ceb2b2 (delayed 1000)
  bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
  ------------[ cut here ]------------
  ODEBUG: free active (active state 0) object type: timer_list hint: workqueue_select_cpu_near kernel/workqueue.c:1549 [inline]
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98 kernel/workqueue.c:1600
  WARNING: CPU: 1 PID: 842 at lib/debugobjects.c:485 debug_print_object+0x180/0x240 lib/debugobjects.c:485
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 1 PID: 842 Comm: kworker/u4:4 Tainted: G S                5.9.0-rc8+ #96
  Hardware name: linux,dummy-virt (DT)
  Workqueue: netns cleanup_net
  Call trace:
   dump_backtrace+0x0/0x4d8 include/linux/bitmap.h:239
   show_stack+0x34/0x48 arch/arm64/kernel/traps.c:142
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x174/0x1f8 lib/dump_stack.c:118
   panic+0x360/0x7a0 kernel/panic.c:231
   __warn+0x244/0x2ec kernel/panic.c:600
   report_bug+0x240/0x398 lib/bug.c:198
   bug_handler+0x50/0xc0 arch/arm64/kernel/traps.c:974
   call_break_hook+0x160/0x1d8 arch/arm64/kernel/debug-monitors.c:322
   brk_handler+0x30/0xc0 arch/arm64/kernel/debug-monitors.c:329
   do_debug_exception+0x184/0x340 arch/arm64/mm/fault.c:864
   el1_dbg+0x48/0xb0 arch/arm64/kernel/entry-common.c:65
   el1_sync_handler+0x170/0x1c8 arch/arm64/kernel/entry-common.c:93
   el1_sync+0x80/0x100 arch/arm64/kernel/entry.S:594
   debug_print_object+0x180/0x240 lib/debugobjects.c:485
   __debug_check_no_obj_freed lib/debugobjects.c:967 [inline]
   debug_check_no_obj_freed+0x200/0x430 lib/debugobjects.c:998
   slab_free_hook mm/slub.c:1536 [inline]
   slab_free_freelist_hook+0x190/0x210 mm/slub.c:1577
   slab_free mm/slub.c:3138 [inline]
   kfree+0x13c/0x460 mm/slub.c:4119
   bond_free_slave+0x8c/0xf8 drivers/net/bonding/bond_main.c:1492
   __bond_release_one+0xe0c/0xec8 drivers/net/bonding/bond_main.c:2190
   bond_slave_netdev_event drivers/net/bonding/bond_main.c:3309 [inline]
   bond_netdev_event+0x8f0/0xa70 drivers/net/bonding/bond_main.c:3420
   notifier_call_chain+0xf0/0x200 kernel/notifier.c:83
   __raw_notifier_call_chain kernel/notifier.c:361 [inline]
   raw_notifier_call_chain+0x44/0x58 kernel/notifier.c:368
   call_netdevice_notifiers_info+0xbc/0x150 net/core/dev.c:2033
   call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
   call_netdevice_notifiers net/core/dev.c:2059 [inline]
   rollback_registered_many+0x6a4/0xec0 net/core/dev.c:9347
   unregister_netdevice_many.part.0+0x2c/0x1c0 net/core/dev.c:10509
   unregister_netdevice_many net/core/dev.c:10508 [inline]
   default_device_exit_batch+0x294/0x338 net/core/dev.c:10992
   ops_exit_list.isra.0+0xec/0x150 net/core/net_namespace.c:189
   cleanup_net+0x44c/0x888 net/core/net_namespace.c:603
   process_one_work+0x96c/0x18c0 kernel/workqueue.c:2269
   worker_thread+0x3f0/0xc30 kernel/workqueue.c:2415
   kthread+0x390/0x498 kernel/kthread.c:292
   ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:925

This is a potential use-after-free if the sysfs nodes are being accessed
whilst removing the struct slave, so wait for the object destruction to
complete before freeing the struct slave itself.
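
The general pattern, freeing the containing structure from the embedded
object's release callback rather than directly, can be sketched in
userspace C. All names below are hypothetical stand-ins, not the
bonding driver's or the kobject API's; the counter is a plain int, so
the sketch is not thread-safe.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kobject: a refcount plus a release hook. */
struct mini_kobj {
	int refcount;
	void (*release)(struct mini_kobj *kobj);
};

/* Hypothetical stand-in for struct slave, embedding the object above. */
struct mini_slave {
	int id;
	struct mini_kobj kobj;
};

static int freed_count;	/* how many slaves the release hook has freed */

/* Runs only when the last reference is dropped; frees the container. */
static void mini_slave_release(struct mini_kobj *kobj)
{
	struct mini_slave *slave = (struct mini_slave *)
		((char *)kobj - offsetof(struct mini_slave, kobj));

	free(slave);
	freed_count++;
}

static struct mini_slave *mini_slave_alloc(int id)
{
	struct mini_slave *slave = calloc(1, sizeof(*slave));

	if (!slave)
		return NULL;
	slave->id = id;
	slave->kobj.refcount = 1;	/* owner's reference */
	slave->kobj.release = mini_slave_release;
	return slave;
}

static void mini_kobj_get(struct mini_kobj *kobj)
{
	kobj->refcount++;
}

static void mini_kobj_put(struct mini_kobj *kobj)
{
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}
```

In this sketch the owner never calls free() itself. It drops its
reference with mini_kobj_put(), and the memory goes away only after any
other holder (for example a sysfs reader) has dropped its own.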

Fixes: 07699f9a ("bonding: add sysfs /slave dir for bond slave devices.")
Fixes: a068aab4 ("bonding: Fix reference count leak in bond_sysfs_slave_add.")
Cc: Qiushi Wu <wu000273@umn.edu>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20201120142827.879226-1-jamie@nuviainc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b9ad3e9f

21 Nov, 2020 · 6 commits

Merge branch 's390-qeth-fixes-2020-11-20' · 207d0bfc

Committed by Jakub Kicinski on Nov 20, 2020

Julian Wiedmann says:

====================
s390/qeth: fixes 2020-11-20

This brings several fixes for qeth's af_iucv-specific code paths.

Also one fix by Alexandra for the recently added BR_LEARNING_SYNC
support. We want to trust the feature indication bit, so that HW can
mask it out if there's any issues on their end.
====================

Link: https://lore.kernel.org/r/20201120090939.101406-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

207d0bfc

s390/qeth: fix tear down of async TX buffers · 7ed10e16

Committed by Julian Wiedmann on Nov 20, 2020

When qeth_iqd_tx_complete() detects that a TX buffer requires additional
async completion via QAOB, it might fail to replace the queue entry's
metadata (and ends up triggering recovery).

Assume now that the device gets torn down, overruling the recovery.
If the QAOB notification then arrives before the tear down has
sufficiently progressed, the buffer state is changed to
QETH_QDIO_BUF_HANDLED_DELAYED by qeth_qdio_handle_aob().

The tear down code calls qeth_drain_output_queue(), where
qeth_cleanup_handled_pending() will then attempt to replace such a
buffer _again_. If it succeeds this time, the buffer ends up dangling in
its replacement's ->next_pending list ... where it will never be freed,
since there's no further call to qeth_cleanup_handled_pending().

But the second attempt isn't actually needed, we can simply leave the
buffer on the queue and re-use it after a potential recovery has
completed. The qeth_clear_output_buffer() in qeth_drain_output_queue()
will ensure that it's in a clean state again.

Fixes: 72861ae7 ("qeth: recovery through asynchronous delivery")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

7ed10e16

s390/qeth: fix af_iucv notification race · 8908f36d

Committed by Julian Wiedmann on Nov 20, 2020

The two expected notification sequences are
1. TX_NOTIFY_PENDING with a subsequent TX_NOTIFY_DELAYED_*, when
   our TX completion code first observed the pending TX and the QAOB
   then completes at a later time; or
2. TX_NOTIFY_OK, when qeth_qdio_handle_aob() picked up the QAOB
   completion before our TX completion code even noticed that the TX
   was pending.

But as qeth_iqd_tx_complete() and qeth_qdio_handle_aob() can run
concurrently, we may end up with a race that results in a sequence of
TX_NOTIFY_DELAYED_* followed by TX_NOTIFY_PENDING, which would confuse
the af_iucv code in its tracking of pending transmits.

Rework the notification code, so that qeth_qdio_handle_aob() defers its
notification if the TX completion code is still active.
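
One way to picture such a deferral is a small state machine driven by
compare-and-exchange. The sketch below is a simplified illustration
only: the states, the notification values, and both function names are
hypothetical and are not the qeth driver's data structures. It merely
shows how whichever path runs second can take over the notification so
that only the two expected sequences occur.

```c
#include <stdatomic.h>

/* Hypothetical buffer states for this sketch. */
enum sketch_buf_state {
	SKETCH_TX_ACTIVE,	/* TX completion code still working on it */
	SKETCH_TX_PENDING,	/* TX completion done, QAOB outstanding */
	SKETCH_AOB_DEFERRED,	/* QAOB done first, notification deferred */
	SKETCH_DONE,
};

/* Hypothetical notifications, mirroring the names used in the text. */
enum sketch_notify {
	SKETCH_NOTIFY_NONE,
	SKETCH_NOTIFY_PENDING,
	SKETCH_NOTIFY_DELAYED,
	SKETCH_NOTIFY_OK,
};

struct sketch_buf {
	_Atomic int state;
};

/* End of the TX completion path: returns the notification to deliver. */
static enum sketch_notify sketch_tx_complete(struct sketch_buf *buf)
{
	int expected = SKETCH_TX_ACTIVE;

	if (atomic_compare_exchange_strong(&buf->state, &expected,
					   SKETCH_TX_PENDING))
		return SKETCH_NOTIFY_PENDING;	/* QAOB not seen yet */

	/* The QAOB handler got there first and deferred to us. */
	atomic_store(&buf->state, SKETCH_DONE);
	return SKETCH_NOTIFY_OK;
}

/* QAOB completion path: returns the notification to deliver, if any. */
static enum sketch_notify sketch_handle_aob(struct sketch_buf *buf)
{
	int expected = SKETCH_TX_ACTIVE;

	if (atomic_compare_exchange_strong(&buf->state, &expected,
					   SKETCH_AOB_DEFERRED))
		return SKETCH_NOTIFY_NONE;	/* TX path still active: defer */

	/* TX completion already reported PENDING; follow up with DELAYED. */
	atomic_store(&buf->state, SKETCH_DONE);
	return SKETCH_NOTIFY_DELAYED;
}
```

Under these assumptions the possible outcomes are PENDING followed by
DELAYED, or a single OK; DELAYED can never precede PENDING.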

Fixes: b3332930 ("qeth: add support for af_iucv HiperSockets transport")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

8908f36d

s390/qeth: make af_iucv TX notification call more robust · 34c7f50f

Committed by Julian Wiedmann on Nov 20, 2020

Calling into socket code is ugly already; at least check whether we are
dealing with the expected sk_family. Only looking at skb->protocol is
bound to cause trouble (consider e.g. af_packet).
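
The extra check can be sketched with mock structures. Everything here
is a hypothetical stand-in: the struct layouts, the function name, and
the SKETCH_* constants (whose values are assumptions meant to resemble
the kernel's) are for illustration and are not the qeth code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, defined locally so the sketch is self-contained. */
#define SKETCH_ETH_P_AF_IUCV 0xFBFB
#define SKETCH_AF_PACKET     17
#define SKETCH_AF_IUCV       32

/* Minimal mock-ups of a socket and a socket buffer. */
struct sketch_sock {
	int sk_family;
};

struct sketch_skb {
	uint16_t protocol;
	struct sketch_sock *sk;
};

/*
 * Only call into af_iucv when both the packet protocol and the owning
 * socket's family match. A packet socket can emit a frame with the
 * same protocol value, so the protocol alone is not sufficient.
 */
static bool sketch_may_notify_iucv(const struct sketch_skb *skb)
{
	return skb->protocol == SKETCH_ETH_P_AF_IUCV &&
	       skb->sk != NULL &&
	       skb->sk->sk_family == SKETCH_AF_IUCV;
}
```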

Fixes: b3332930 ("qeth: add support for af_iucv HiperSockets transport")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

34c7f50f

s390/qeth: Remove pnso workaround · 0d0e2b53

Committed by Alexandra Winter on Nov 20, 2020

Remove workaround that supported early hardware implementations
of PNSO OC3. Rely on the 'enarf' feature bit instead.

Fixes: fa115adf ("s390/qeth: Detect PNSO OC3 capability")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
[jwi: use logical instead of bit-wise AND]
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

0d0e2b53

Merge branch 'tcp-address-issues-with-ect0-not-being-set-in-dctcp-packets' · e10823c7

Committed by Jakub Kicinski on Nov 20, 2020

Alexander Duyck says:

====================
tcp: Address issues with ECT0 not being set in DCTCP packets

This patch set is meant to address issues seen with SYN/ACK packets not
containing the ECT0 bit when DCTCP is configured as the congestion control
algorithm for a TCP socket.

A simple test using "tcpdump" and "test_progs -t bpf_tcp_ca" makes the
issue obvious. The captured packets show a SYN/ACK packet whose ECT0 bit
does not match the other packets of the flow when the congestion control
algorithm is switched from the default. For example, when going from a
non-DCTCP to a DCTCP congestion control algorithm, the SYN/ACK IPv6 header
will not have ECT0 set while the other packets in the flow will. Likewise,
if we switch from a default of DCTCP to cubic, the ECT0 bit will be set in
the SYN/ACK while the other packets in the flow will not have it.
====================
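
The mismatch described in the cover letter can be expressed as a check
on the ECN field, which occupies the two low-order bits of the IPv4 TOS
or IPv6 traffic class byte, with ECT(0) encoded as binary 10. The
helper names and SKETCH_* macros below are hypothetical and exist only
for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ECN_MASK 0x03	/* two low-order bits of TOS / traffic class */
#define SKETCH_ECN_ECT0 0x02	/* ECT(0) codepoint, binary 10 */

/* True if the TOS / traffic class byte carries the ECT(0) codepoint. */
static bool sketch_has_ect0(uint8_t tos_or_tclass)
{
	return (tos_or_tclass & SKETCH_ECN_MASK) == SKETCH_ECN_ECT0;
}

/* True if the SYN/ACK agrees with a later packet of the same flow. */
static bool sketch_synack_matches_flow(uint8_t synack_tos, uint8_t data_tos)
{
	return sketch_has_ect0(synack_tos) == sketch_has_ect0(data_tos);
}
```

A capture showing the reported problem would make
sketch_synack_matches_flow() return false, for instance a SYN/ACK with
an ECN field of 00 in a flow whose data packets carry 10.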

Link: https://lore.kernel.org/r/160582070138.66684.11785214534154816097.stgit@localhost.localdomain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e10823c7
