Commits · 6cd6cbf593bfa3ae6fc3ed34ac21da4d35045425 · gsplhtlxg / clone-Linux

24 Mar, 2020 1 commit

tcp: repair: fix TCP_QUEUE_SEQ implementation · 6cd6cbf5

Authored by Eric Dumazet on Mar 18, 2020

When an application uses the TCP_QUEUE_SEQ socket option to
change tp->rcv_nxt, we must also update tp->copied_seq.

Otherwise, stuff relying on tcp_inq() being precise can
eventually be confused.

For example, tcp_zerocopy_receive() might crash because
it does not expect tcp_recv_skb() to return NULL.

We could add tests in various places to fix the issue,
or simply make sure tcp_inq() won't return a random value,
and leave the fast path as it is.

Note that this fixes ioctl(fd, SIOCINQ, &val) at the same
time.

Fixes: ee995283 ("tcp: Initial repair mode")
Fixes: 05255b82 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
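
As an illustration of why the two counters must move together, here is a
minimal userspace sketch, not kernel code. The struct seq_model type and the
model_inq(), set_queue_seq_buggy() and set_queue_seq_fixed() helpers are
hypothetical names; the real tcp_inq() also looks at connection state and
urgent data, which this model ignores.

```c
#include <stdint.h>

/* Hypothetical userspace model of the two sequence counters. */
struct seq_model {
	uint32_t rcv_nxt;    /* next sequence number expected from the peer */
	uint32_t copied_seq; /* first sequence number not yet read by the app */
};

/* Bytes reported as readable, in the spirit of tcp_inq(). */
static uint32_t model_inq(const struct seq_model *m)
{
	return m->rcv_nxt - m->copied_seq;
}

/* Repair-mode update before the fix: only rcv_nxt moves. */
static void set_queue_seq_buggy(struct seq_model *m, uint32_t seq)
{
	m->rcv_nxt = seq;
}

/* Repair-mode update after the fix: both counters move together. */
static void set_queue_seq_fixed(struct seq_model *m, uint32_t seq)
{
	m->rcv_nxt = seq;
	m->copied_seq = seq;
}
```

With the buggy variant the model reports queued bytes although the receive
queue is empty, which is the kind of imprecision the message describes.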

22 Mar, 2020 13 commits

selftests/net: add definition for SOL_DCCP to fix compilation errors for old libc · 83a9b6f6

Authored by Alan Maguire on Mar 18, 2020

Many systems build/test up-to-date kernels with older libcs, and
an older glibc (2.17) lacks the definition of SOL_DCCP in
/usr/include/bits/socket.h (it was added in the 4.6 timeframe).

Adding the definition to the test program avoids a compilation
failure that gets in the way of building tools/testing/selftests/net.
The test itself will work once the definition is added, either
skipping because DCCP is not configured in the kernel under test
or passing, so there seem to be no other dependencies on a more
up-to-date glibc beyond that missing definition.

Fixes: 11fb60d1 ("selftests: net: reuseport_addr_any: add DCCP")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
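
The fallback definition described above can be sketched as follows. The value
269 is the one Linux uses for SOL_DCCP; the dccp_level() helper is a
hypothetical name added only so the snippet has something to call.

```c
#include <sys/socket.h>

/* Older libc headers may not define SOL_DCCP; provide it if missing. */
#ifndef SOL_DCCP
#define SOL_DCCP 269
#endif

/* Hypothetical helper: the socket level a test would pass to setsockopt(). */
static int dccp_level(void)
{
	return SOL_DCCP;
}
```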

net: bcmgenet: always enable status blocks · 9a9ba2a4

Authored by Doug Berger on Mar 17, 2020

The hardware offloading of the NETIF_F_HW_CSUM and NETIF_F_RXCSUM
features requires the use of Transmit Status Blocks before transmit
frame data and Receive Status Blocks before receive frame data to
carry the checksum information.

Unfortunately, these status blocks are currently only enabled when
the NETIF_F_HW_CSUM feature is enabled. As a result NETIF_F_RXCSUM
will not actually be offloaded to the hardware unless both it and
NETIF_F_HW_CSUM are enabled. Fortunately, that is the default
configuration.

This commit addresses this issue by always enabling the use of
status blocks on both transmit and receive frames. Further, it
replaces the use of a dedicated flag within the driver private
data structure with direct use of the netdev features flags.

Fixes: 81015539 ("net: bcmgenet: use CHECKSUM_COMPLETE for NETIF_F_RXCSUM")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: dp83867: w/a for fld detect threshold bootstrapping issue · 749f6f68

Authored by Grygorii Strashko on Mar 17, 2020

When the DP83867 PHY is strapped to enable the Fast Link Drop (FLD) feature
via STRAP_STS2.STRAP_FLD (reg 0x006F bit 10), the Energy Lost Threshold for
FLD Energy Lost Mode, FLD_THR_CFG.ENERGY_LOST_FLD_THR (reg 0x002e bits 2:0),
defaults to 0x2. This may cause the PHY link to be unstable. The
new DP83867 DM recommends always restoring ENERGY_LOST_FLD_THR to 0x1.

Hence, restore default value of FLD_THR_CFG.ENERGY_LOST_FLD_THR to 0x1 when
FLD is enabled by bootstrapping as recommended by DM.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: dwmac-rk: fix error path in rk_gmac_probe · 9de9aa48

Authored by Emil Renner Berthing on Mar 21, 2020

Make sure we also clean up devicetree-related configuration
when clock init fails.

Fixes: fecd4d7e ("net: stmmac: dwmac-rk: Add integrated PHY support")
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

slcan: not call free_netdev before rtnl_unlock in slcan_open · 2091a3d4

Authored by Oliver Hartkopp on Mar 21, 2020

As described in the comment before netdev_run_todo, we cannot call free_netdev
before rtnl_unlock. Fix it by reordering the code.

This patch is a 1:1 copy of upstream slip.c commit f596c870
("slip: not call free_netdev before rtnl_unlock in slip_open").

Reported-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: make spdxcheck.py happy · 06e9bfc1

Authored by Lukas Bulwahn on Mar 21, 2020

Headers ionic_if.h and ionic_regs.h are licensed under three alternative
licenses and the used SPDX-License-Identifier expression makes
./scripts/spdxcheck.py complain:

drivers/net/ethernet/pensando/ionic/ionic_if.h: 1:52 Syntax error: OR
drivers/net/ethernet/pensando/ionic/ionic_regs.h: 1:52 Syntax error: OR

As OR is associative, it is irrelevant if the parentheses are put around
the first or the second OR-expression.

Simply add parentheses to make spdxcheck.py happy.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

hsr: fix general protection fault in hsr_addr_is_self() · 3a303cfd

Authored by Taehee Yoo on Mar 21, 2020

port->hsr is used in hsr_handle_frame(), which is the
rx_handler callback.
The hsr master and slaves are initialized in hsr_add_port().
This function initializes several pointers, including port->hsr, after
registering the rx_handler.
So the rx_handler routine could use an uninitialized pointer.
To fix this, the pointers should be initialized before
registering the rx_handler.

Test commands:
    ip netns del left
    ip netns del right
    modprobe -rv veth
    modprobe -rv hsr
    killall ping
    modprobe hsr
    ip netns add left
    ip netns add right
    ip link add veth0 type veth peer name veth1
    ip link add veth2 type veth peer name veth3
    ip link add veth4 type veth peer name veth5
    ip link set veth1 netns left
    ip link set veth3 netns right
    ip link set veth4 netns left
    ip link set veth5 netns right
    ip link set veth0 up
    ip link set veth2 up
    ip link set veth0 address fc:00:00:00:00:01
    ip link set veth2 address fc:00:00:00:00:02
    ip netns exec left ip link set veth1 up
    ip netns exec left ip link set veth4 up
    ip netns exec right ip link set veth3 up
    ip netns exec right ip link set veth5 up
    ip link add hsr0 type hsr slave1 veth0 slave2 veth2
    ip a a 192.168.100.1/24 dev hsr0
    ip link set hsr0 up
    ip netns exec left ip link add hsr1 type hsr slave1 veth1 slave2 veth4
    ip netns exec left ip a a 192.168.100.2/24 dev hsr1
    ip netns exec left ip link set hsr1 up
    ip netns exec left ip n a 192.168.100.1 dev hsr1 lladdr \
	    fc:00:00:00:00:01 nud permanent
    ip netns exec left ip n r 192.168.100.1 dev hsr1 lladdr \
	    fc:00:00:00:00:01 nud permanent
    for i in {1..100}
    do
        ip netns exec left ping 192.168.100.1 &
    done
    ip netns exec left hping3 192.168.100.1 -2 --flood &
    ip netns exec right ip link add hsr2 type hsr slave1 veth3 slave2 veth5
    ip netns exec right ip a a 192.168.100.3/24 dev hsr2
    ip netns exec right ip link set hsr2 up
    ip netns exec right ip n a 192.168.100.1 dev hsr2 lladdr \
	    fc:00:00:00:00:02 nud permanent
    ip netns exec right ip n r 192.168.100.1 dev hsr2 lladdr \
	    fc:00:00:00:00:02 nud permanent
    for i in {1..100}
    do
        ip netns exec right ping 192.168.100.1 &
    done
    ip netns exec right hping3 192.168.100.1 -2 --flood &
    while :
    do
        ip link add hsr0 type hsr slave1 veth0 slave2 veth2
        ip a a 192.168.100.1/24 dev hsr0
        ip link set hsr0 up
        ip link del hsr0
    done

Splat looks like:
[  120.954938][    C0] general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1]I
[  120.957761][    C0] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[  120.959064][    C0] CPU: 0 PID: 1511 Comm: hping3 Not tainted 5.6.0-rc5+ #460
[  120.960054][    C0] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  120.962261][    C0] RIP: 0010:hsr_addr_is_self+0x65/0x2a0 [hsr]
[  120.963149][    C0] Code: 44 24 18 70 73 2f c0 48 c1 eb 03 48 8d 04 13 c7 00 f1 f1 f1 f1 c7 40 04 00 f2 f2 f2 4
[  120.966277][    C0] RSP: 0018:ffff8880d9c09af0 EFLAGS: 00010206
[  120.967293][    C0] RAX: 0000000000000006 RBX: 1ffff1101b38135f RCX: 0000000000000000
[  120.968516][    C0] RDX: dffffc0000000000 RSI: ffff8880d17cb208 RDI: 0000000000000000
[  120.969718][    C0] RBP: 0000000000000030 R08: ffffed101b3c0e3c R09: 0000000000000001
[  120.972203][    C0] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 0000000000000000
[  120.973379][    C0] R13: ffff8880aaf80100 R14: ffff8880aaf800f2 R15: ffff8880aaf80040
[  120.974410][    C0] FS:  00007f58e693f740(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
[  120.979794][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  120.980773][    C0] CR2: 00007ffcb8b38f29 CR3: 00000000afe8e001 CR4: 00000000000606f0
[  120.981945][    C0] Call Trace:
[  120.982411][    C0]  <IRQ>
[  120.982848][    C0]  ? hsr_add_node+0x8c0/0x8c0 [hsr]
[  120.983522][    C0]  ? rcu_read_lock_held+0x90/0xa0
[  120.984159][    C0]  ? rcu_read_lock_sched_held+0xc0/0xc0
[  120.984944][    C0]  hsr_handle_frame+0x1db/0x4e0 [hsr]
[  120.985597][    C0]  ? hsr_nl_nodedown+0x2b0/0x2b0 [hsr]
[  120.986289][    C0]  __netif_receive_skb_core+0x6bf/0x3170
[  120.992513][    C0]  ? check_chain_key+0x236/0x5d0
[  120.993223][    C0]  ? do_xdp_generic+0x1460/0x1460
[  120.993875][    C0]  ? register_lock_class+0x14d0/0x14d0
[  120.994609][    C0]  ? __netif_receive_skb_one_core+0x8d/0x160
[  120.995377][    C0]  __netif_receive_skb_one_core+0x8d/0x160
[  120.996204][    C0]  ? __netif_receive_skb_core+0x3170/0x3170
[ ... ]

Reported-by: syzbot+fcf5dd39282ceb27108d@syzkaller.appspotmail.com
Fixes: c5a75911 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'hinic-BugFixes' · 4abe5a1b

Authored by David S. Miller on Mar 21, 2020

Luo bin says:

====================
hinic: BugFixes

Fix a number of bugs which have been present since the first commit.

The bugs fixed in these patches are hardly exposed unless given
very specific conditions.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

hinic: fix wrong value of MIN_SKB_LEN · 7296695f

Authored by Luo bin on Mar 20, 2020

The minimum skb length that the hardware supports is 32 rather than 17.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hinic: fix wrong para of wait_for_completion_timeout · 0da7c322

Authored by Luo bin on Mar 20, 2020

The second input parameter of wait_for_completion_timeout should
be in jiffies instead of milliseconds.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
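
To make the unit mismatch concrete, here is a small userspace sketch of a
millisecond-to-tick conversion in the spirit of the kernel's
msecs_to_jiffies(). MODEL_HZ and model_msecs_to_ticks() are hypothetical
names, and 250 ticks per second is only an assumed configuration.

```c
/* Hypothetical tick rate; the kernel's HZ is a build-time setting. */
#define MODEL_HZ 250UL

/* Convert milliseconds to ticks, rounding up to a whole tick. */
static unsigned long model_msecs_to_ticks(unsigned long ms)
{
	return (ms * MODEL_HZ + 999UL) / 1000UL;
}
```

Under this assumption a 5000 ms timeout is 1250 ticks, so passing the raw
value 5000 as a tick count would wait four times longer than intended.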

hinic: fix out-of-order excution in arm cpu · 33f15da2

Authored by Luo bin on Mar 20, 2020

Add a read barrier in the driver code to keep from reading other fields
in DMA memory, which is writable by the hardware, until we have verified
that the memory is valid for the driver.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
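
The ordering requirement can be sketched in userspace with C11 atomics, where
an acquire load plays the role of the read barrier. This is only an analogy
under that assumption, not the driver's code; struct desc_model and
read_desc() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor: the producer writes payload, then sets valid. */
struct desc_model {
	_Atomic uint32_t valid;
	uint32_t payload;
};

/* Returns 1 and stores the payload in *out only once the descriptor is valid. */
static int read_desc(struct desc_model *d, uint32_t *out)
{
	/* Acquire ordering keeps the payload read from moving before this check. */
	if (!atomic_load_explicit(&d->valid, memory_order_acquire))
		return 0;
	*out = d->payload;
	return 1;
}
```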

hinic: fix the bug of clearing event queue · 614eaa94

Authored by Luo bin on Mar 20, 2020

We should disable the eq irq before freeing it, and must clear the event queue
depth in hw before freeing the relevant memory to avoid illegal
memory access, and update the consumer idx to avoid an invalid interrupt.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hinic: fix a bug of waitting for IO stopped · 96758117

Authored by Luo bin on Mar 20, 2020

It is unreliable for the fw to check whether IO is stopped, so the driver
waits long enough to ensure the IO process is done in hw before
freeing resources.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Mar, 2020 2 commits

tcp: also NULL skb->dev when copy was needed · 07f8e4d0

Authored by Florian Westphal on Mar 20, 2020

In rare cases the retransmit logic will make a full skb copy, which will not
trigger the zeroing added in the recent change
b738a185 ("tcp: ensure skb->dev is NULL before leaving TCP stack").

Cc: Eric Dumazet <edumazet@google.com>
Fixes: 75c119af ("tcp: implement rb-tree based retransmit queue")
Fixes: 28f8bfd1 ("netfilter: Support iif matches in POSTROUTING")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 702151da

Authored by David S. Miller on Mar 20, 2020

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Refetch IP header pointer after pskb_may_pull() in flowtable,
   from Haishuang Yan.

2) Fix memleak in flowtable offload in nf_flow_table_free(),
   from Paul Blakey.

3) Set control.addr_type mask in flowtable offload, from Edward Cree.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

20 Mar, 2020 10 commits

tcp: ensure skb->dev is NULL before leaving TCP stack · b738a185

Authored by Eric Dumazet on Mar 19, 2020

skb->rbnode shares storage with three skb fields: next, prev, dev.

When a packet is sent, TCP keeps the original skb (master)
in a rtx queue, which was converted to rbtree a while back.

__tcp_transmit_skb() is responsible for cloning the master skb
and adding the TCP header to the clone before sending it
to the network layer.

skb_clone() already clears skb->next and skb->prev, but copies
the master oskb->dev into the clone.

We need to clear skb->dev, otherwise lower layers could interpret
the value as a pointer to a netdev.

This old bug surfaced recently when commit 28f8bfd1
("netfilter: Support iif matches in POSTROUTING") was merged.

Before this netfilter commit, the skb->dev value was ignored and
changed before reaching dev_queue_xmit().

Fixes: 75c119af ("tcp: implement rb-tree based retransmit queue")
Fixes: 28f8bfd1 ("netfilter: Support iif matches in POSTROUTING")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Martin Zaharinov <micron10@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: fix Txq restart check during backpressure · f1f20a86

Authored by Rahul Lakkireddy on Mar 19, 2020

During backpressure, the driver reclaims descriptors in much smaller
batches, even if the hardware indicates more to reclaim. So, fix the check
to restart the Txq during backpressure by looking at how many
descriptors the hardware had indicated to reclaim, and not at how many
descriptors the driver had actually reclaimed. Once the Txq is
restarted, the driver will reclaim even more descriptors when the Tx path
is entered again.

Fixes: d429005f ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: fix throughput drop during Tx backpressure · 7affd808

Authored by Rahul Lakkireddy on Mar 19, 2020

Commit 7c3bebc3 ("cxgb4: request the TX CIDX updates to status page")
reverted to getting Tx CIDX updates via DMA, instead of the interrupts
introduced by commit d429005f ("cxgb4/cxgb4vf: Add support for SGE
doorbell queue timer").

However, it missed reverting several code changes where Tx CIDX
updates are not explicitly requested during backpressure when using
interrupt mode. These missed changes cause slow recovery during
backpressure because the corresponding interrupt no longer comes,
which results in a Tx throughput drop.

So, revert these missed code changes as well, which allows
explicitly requesting Tx CIDX updates when backpressure happens.
This enables the corresponding interrupt with the Tx CIDX update message
to be generated, which speeds up recovery and restores
throughput.

Fixes: 7c3bebc3 ("cxgb4: request the TX CIDX updates to status page")
Fixes: d429005f ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mt7530: Change the LINK bit to reflect the link status · 22259471

Authored by René van Dorst on Mar 19, 2020

Andrew reported:

After a number of network port link up/down changes, sometimes the switch
port gets stuck in a state where it thinks it is still transmitting packets
but the cpu port is not actually transmitting anymore. In this state you
will see a message on the console
"mtk_soc_eth 1e100000.ethernet eth0: transmit timed out" and the Tx counter
in ifconfig will be incrementing on virtual port, but not incrementing on
cpu port.

The issue is that MAC TX/RX status has no impact on the link status or
queue manager of the switch. So the queue manager just queues up packets
of a disabled port and sends out pause frames when the queue is full.

Change the LINK bit to reflect the link status.

Fixes: b8f126a8 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Reported-by: Andrew Smith <andrew.smith@digi.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'rxrpc-fixes-20200319' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 3ac9eb42

Authored by David S. Miller on Mar 19, 2020

David Howells says:

====================
rxrpc, afs: Interruptibility fixes

Here are a number of fixes for AF_RXRPC and AFS that make AFS system calls
less interruptible and so less likely to leave the filesystem in an
uncertain state.  There's also a miscellaneous patch to make tracing
consistent.

 (1) Firstly, abstract out the Tx space calculation in sendmsg.  Much the
     same code is replicated in a number of places that subsequent patches
     are going to alter, including adding another copy.

 (2) Fix Tx interruptibility by allowing a kernel service, such as AFS, to
     request that a call be interruptible only when waiting for a call slot
     to become available (ie. the call has not taken place yet) or that a
     call be not interruptible at all (e.g. when we want to do writeback
     and don't want a signal interrupting a VM-induced writeback).

 (3) Increase the minimum delay on MSG_WAITALL for userspace sendmsg() when
     waiting for Tx buffer space as a 2*RTT delay is really small over 10G
     ethernet and a 1 jiffy timeout might be essentially 0 if at the end of
     the jiffy period.

 (4) Fix some tracing output in AFS to make it consistent with rxrpc.

 (5) Make sure aborted asynchronous AFS operations are tidied up properly
     so we don't end up with stuck rxrpc calls.

 (6) Make AFS client calls uninterruptible in the Rx phase.  If we don't
     wait for the reply to be fully gathered, we can't update the local VFS
     state and we end up in an indeterminate state with respect to the
     server.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: pci: Only issue reset when system is ready · 6002059d

Authored by Ido Schimmel on Mar 19, 2020

During initialization the driver issues a software reset command and
then waits for the system status to change back to "ready" state.

However, before issuing the reset command the driver does not check that
the system is actually in "ready" state. On Spectrum-{1,2} systems this
was always the case as the hardware initialization time is very short.
On Spectrum-3 systems this is no longer the case. This results in the
software reset command timing-out and the driver failing to load:

[ 6.347591] mlxsw_spectrum3 0000:06:00.0: Cmd exec timed-out (opcode=40(ACCESS_REG),opcode_mod=0,in_mod=0)
[ 6.358382] mlxsw_spectrum3 0000:06:00.0: Reg cmd access failed (reg_id=9023(mrsr),type=write)
[ 6.368028] mlxsw_spectrum3 0000:06:00.0: cannot register bus device
[ 6.375274] mlxsw_spectrum3: probe of 0000:06:00.0 failed with error -110

Fix this by waiting for the system to become ready both before issuing
the reset command and afterwards. In case of failure, print the last
system status to aid in debugging.

Fixes: da382875 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: flowtable: populate addr_type mask · 15ff1972

Authored by Edward Cree on Mar 19, 2020

nf_flow_rule_match() sets control.addr_type in key, so needs to also set
the corresponding mask. An exact match is wanted, so mask is all ones.

Fixes: c29f74e0 ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
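
To illustrate why a missing mask matters, here is a minimal sketch of
key/mask matching, assuming the common convention that only the bits set in
the mask are compared. The masked_match() helper is a hypothetical name and
is not part of the flowtable code.

```c
#include <stdint.h>

/* Compare a packet field against a key, looking only at bits set in mask. */
static int masked_match(uint16_t field, uint16_t key, uint16_t mask)
{
	return (field & mask) == (key & mask);
}
```

With an all-zero mask every value matches, so the key alone has no effect;
an all-ones mask gives the exact match the message asks for.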

netfilter: flowtable: Fix flushing of offloaded flows on free · c921ffe8

Authored by Paul Blakey on Mar 19, 2020

When freeing a flowtable with offloaded flows, the flows are deleted from
hardware but are not deleted from the flow table, leaking them
and leaving their offload bit on.

Add a second pass of the disabled gc to delete these flows from
the flow table before freeing it.

Fixes: c29f74e0 ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: flowtable: reload ip{v6}h in nf_flow_tuple_ip{v6} · 41e9ec5a

Authored by Haishuang Yan on Mar 17, 2020

Since pskb_may_pull may change skb->data, we need to reload ip{v6}h at
the right place.

Fixes: a908fdec ("netfilter: nf_flow_table: move ipv6 offload hook code to nf_flow_table")
Fixes: 7d208687 ("netfilter: nf_flow_table: move ipv4 offload hook code to nf_flow_table")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
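
The hazard can be sketched in userspace: a helper that may move the buffer
invalidates any header pointer taken earlier, so the pointer has to be
reloaded afterwards. struct buf_model, model_may_pull() and
header_stable_across_pull() are hypothetical names; model_may_pull() is only
a stand-in that always reallocates, which pskb_may_pull() does not
necessarily do.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer with a movable data area. */
struct buf_model {
	unsigned char *data;
	size_t len;
};

/* Stand-in for a pull: moves the data to a new allocation. 1 on success. */
static int model_may_pull(struct buf_model *b)
{
	unsigned char *fresh = malloc(b->len);

	if (!fresh)
		return 0;
	memcpy(fresh, b->data, b->len);
	free(b->data);
	b->data = fresh;
	return 1;
}

/* Reads the version nibble before and after a pull, reloading the pointer. */
static int header_stable_across_pull(struct buf_model *b)
{
	const unsigned char *hdr = b->data;
	int version = hdr[0] >> 4; /* valid: read before the pull */

	if (!model_may_pull(b))
		return -1;
	hdr = b->data; /* reload: the old pointer now dangles */
	return version == (hdr[0] >> 4);
}
```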

netfilter: flowtable: reload ip{v6}h in nf_flow_nat_ip{v6} · 61abaf02

Authored by Haishuang Yan on Mar 17, 2020

Since nf_flow_snat_port and nf_flow_snat_ip{v6} call pskb_may_pull(),
which may change skb->data, we need to reload ip{v6}h at the right
place.

19 Mar, 2020 8 commits

Merge branch 'wireguard-fixes' · 3c025b63

Authored by David S. Miller on Mar 18, 2020

Jason A. Donenfeld says:

====================
wireguard fixes for 5.6-rc7

I originally intended to spend this cycle working on fun optimizations
and architecture for WireGuard for 5.7, but I've been a bit neurotic
about having 5.6 ship without any show stopper bugs. WireGuard has been
stable for a long time now, but that doesn't make me any less nervous
about the real deal in 5.6. To that end, I've been doing code reviews
and having discussions, and we also had a security firm audit the code.
That audit didn't turn up any vulnerabilities, but they did make a good
defense-in-depth suggestion. This series contains:

1) Removal of a duplicated header, from YueHaibing.
2) Testing with 64-bit time in our test suite.
3) Account for skb->protocol==0 due to AF_PACKET sockets, suggested
   by Florian Fainelli.
4) Clean up some code in an unreachable switch/case branch, suggested
   by Florian Fainelli.
5) Better handling of low-order points, discussed with Mathias
   Hall-Andersen.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

wireguard: noise: error out precomputed DH during handshake rather than config · 11a7686a

Authored by Jason A. Donenfeld on Mar 18, 2020

We precompute the static-static ECDH during configuration time, in order
to save an expensive computation later when receiving network packets.
However, not all ECDH computations yield a contributory result. Previously,
we simply did not let those peers be added to the interface. However,
this creates a strange inconsistency, since it was still possible to add
other weird points, like a valid public key plus a low-order point, and,
as with points that result in zeros, a handshake would not complete. In
order to make the behavior more uniform and less surprising, simply
allow all peers to be added. Then, we'll error out later when doing the
crypto if there's an issue. This also adds more separation between the
crypto layer and the configuration layer.

Discussed-with: Mathias Hall-Andersen <mathias@hall-andersen.dk>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wireguard: receive: remove dead code from default packet type case · 2b8765c5

Authored by Jason A. Donenfeld on Mar 18, 2020

The situation in which we wind up hitting the default case here
indicates a major bug in earlier parsing code. It is not a usual thing
that should ever happen, which means a "friendly" message for it doesn't
make sense. Rather, replace this with a WARN_ON, just like we do earlier
in the file for a similar situation, so that somebody sends us a bug
report and we can fix it.

Reported-by: Fabian Freyer <fabianfreyer@radicallyopensecurity.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wireguard: queueing: account for skb->protocol==0 · a5588604

Authored by Jason A. Donenfeld on Mar 18, 2020

We carry out checks to the effect of:

  if (skb->protocol != wg_examine_packet_protocol(skb))
    goto err;

By having wg_examine_packet_protocol return 0 on failure, this
means that the check above still passes in the case where skb->protocol
is zero, which is possible to hit with AF_PACKET:

  struct sockaddr_pkt saddr = { .spkt_device = "wg0" };
  unsigned char buffer[5] = { 0 };
  sendto(socket(AF_PACKET, SOCK_PACKET, /* skb->protocol = */ 0),
         buffer, sizeof(buffer), 0, (const struct sockaddr *)&saddr, sizeof(saddr));

Additional checks mean that this isn't actually a problem in the code
base, but I could imagine it becoming a problem later if the function is
used more liberally.

I would prefer to fix this by having wg_examine_packet_protocol return a
32-bit ~0 value on failure, which will never match any value of
skb->protocol, which would simply change the generated code from a mov
to a movzx. However, sparse complains, and adding __force casts doesn't
seem like a good idea, so instead we just add a simple helper function
to check for the zero return value. Since wg_examine_packet_protocol
itself gets inlined, this winds up not adding an additional branch to
the generated code, since the 0 return value already happens in a
mergeable branch.

Reported-by: Fabian Freyer <fabianfreyer@radicallyopensecurity.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
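
The difference between the plain comparison and a helper that rejects the
zero return value can be sketched in userspace as follows. All three
functions are hypothetical models rather than the WireGuard code, and
EtherType values are kept in host byte order for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Model: EtherType implied by the IP version nibble, or 0 on failure. */
static uint16_t model_examine_protocol(const unsigned char *pkt, size_t len)
{
	if (len >= 20 && (pkt[0] >> 4) == 4)
		return 0x0800; /* IPv4 */
	if (len >= 40 && (pkt[0] >> 4) == 6)
		return 0x86DD; /* IPv6 */
	return 0;
}

/* Plain comparison: a claimed protocol of 0 matches the failure value. */
static int naive_check(uint16_t claimed, const unsigned char *pkt, size_t len)
{
	return claimed == model_examine_protocol(pkt, len);
}

/* Helper-style check: the failure value never counts as a match. */
static int strict_check(uint16_t claimed, const unsigned char *pkt, size_t len)
{
	uint16_t real = model_examine_protocol(pkt, len);

	return real != 0 && claimed == real;
}
```

A zeroed 5-byte buffer with a claimed protocol of 0, as in the AF_PACKET
example, passes the plain comparison but fails the stricter one.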

wireguard: selftests: test using new 64-bit time_t · 551599ed

Authored by Jason A. Donenfeld on Mar 18, 2020

In case this helps expose bugs with the newer 64-bit time_t types, we do
our testing with the newer musl that supports this as well as
CONFIG_COMPAT_32BIT_TIME=n. This matters to us, since wireguard does in
fact deal with timestamps.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wireguard: selftests: remove duplicated include <sys/types.h> · 16639115

Authored by YueHaibing on Mar 18, 2020

This commit removes a duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: check return value of gro_cells_init() · 384d91c2

Authored by Taehee Yoo on Mar 18, 2020

gro_cells_init() returns an error if memory allocation fails,
but the vxlan module doesn't check the return value of gro_cells_init().

Fixes: 58ce31cc ("vxlan: GRO support at tunnel layer")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: act_ct: Fix leak of ct zone template on replace · dd2af104

Authored by Paul Blakey on Mar 18, 2020

Currently, on replace, the previous action instance's params
are swapped with newly allocated params. The old params are
only freed (via kfree_rcu), without releasing the allocated
ct zone template related to them.

Call tcf_ct_params_free (via call_rcu) for the old params,
so the template is released as well.

Fixes: b57dc7c1 ("net/sched: Introduce action ct")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Mar, 2020 6 commits

net: core: dev.c: fix a documentation warning · 2de9780f

Authored by Mauro Carvalho Chehab on Mar 17, 2020

In this markup, "foo_" denotes a link. In this kernel-doc
comment we don't want a link, but a literal
reference. So, escape the literal with ``foo`` in order to
avoid this warning:

	./net/core/dev.c:5195: WARNING: Unknown target name: "page_is".

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: sfp-bus.c: get rid of docs warnings · 6497ca07

Authored by Mauro Carvalho Chehab on Mar 17, 2020

The indentation of the return values is weird, causing these
warnings:

	./drivers/net/phy/sfp-bus.c:579: WARNING: Unexpected indentation.
	./drivers/net/phy/sfp-bus.c:619: WARNING: Unexpected indentation.

Use a list and change its indentation so that it is properly
parsed by the documentation toolchain.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ENA-driver-bug-fixes' · 15538575

Authored by David S. Miller on Mar 17, 2020

Arthur Kiyanovski says:

====================
ENA driver bug fixes
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: fix continuous keep-alive resets · dfdde134

Authored by Arthur Kiyanovski on Mar 17, 2020

last_keep_alive_jiffies is updated in probe and when a keep-alive
event is received. If the driver times out on a keep-alive event,
it is likely to keep timing out on subsequent keep-alive events.
This is because when the driver recovers from the keep-alive-timeout reset,
the value of last_keep_alive_jiffies is very old, and if a keep-alive
event is not received before the next timer expires, the value of
last_keep_alive_jiffies will cause another keep-alive-timeout reset,
and so forth in a loop.

Solution:
Update last_keep_alive_jiffies whenever the device is restored after
reset.

Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Noam Dagan <ndagan@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
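
The reset loop and its fix can be modeled with plain tick counters. This is a
simplified sketch under the assumption that the timeout test is "now minus
last keep-alive exceeds the timeout"; struct ka_model and both helpers are
hypothetical names, not the ENA driver's code.

```c
/* Hypothetical keep-alive bookkeeping, in arbitrary ticks. */
struct ka_model {
	unsigned long last_keep_alive;
	unsigned long timeout;
};

/* True when no keep-alive was seen within the timeout window. */
static int keep_alive_expired(const struct ka_model *m, unsigned long now)
{
	return now - m->last_keep_alive > m->timeout;
}

/* The fix: refresh the timestamp when the device is restored after reset. */
static void restore_after_reset(struct ka_model *m, unsigned long now)
{
	m->last_keep_alive = now;
}
```

Without the refresh, the stale timestamp would make the next timer check
expire again right after the reset.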

net: ena: avoid memory access violation by validating req_id properly · 30623e1e

Authored by Arthur Kiyanovski on Mar 17, 2020

Rx req_id is an index in struct ena_eth_io_rx_cdesc_base.
The driver should validate that the Rx req_id it received from
the device is in the range [0, ring_size - 1]. Failure to do so could
lead to a potential memory access violation.
The validation was mistakenly done when refilling
the Rx submission queue and not in the Rx completion queue.

Fixes: ad974bae ("net: ena: add support for out of order rx buffers refill")
Signed-off-by: Noam Dagan <ndagan@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
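
A range check of this kind can be sketched as follows. struct rx_buf_model
and lookup_rx_buf() are hypothetical names used only for illustration; the
point is that the device-supplied index is validated before it is used to
index the ring.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-buffer bookkeeping entry. */
struct rx_buf_model {
	int len;
};

/* Returns the entry for req_id, or NULL if req_id is outside the ring. */
static struct rx_buf_model *lookup_rx_buf(struct rx_buf_model *ring,
					  uint16_t ring_size, uint16_t req_id)
{
	if (req_id >= ring_size)
		return NULL; /* untrusted index from the device: reject */
	return &ring[req_id];
}
```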

net: ena: fix request of incorrect number of IRQ vectors · e02ae6ed

Authored by Arthur Kiyanovski on Mar 17, 2020

Bug:
In short the main issue is caused by the fact that the number of queues
is changed using ethtool after ena_probe() has been called and before
ena_up() was executed. Here is the full scenario in detail:

* ena_probe() is called when the driver is loaded, the driver is not up
  yet at the end of ena_probe().
* The number of queues is changed -> io_queue_count is changed as well -
  ena_up() is not called since the "dev_was_up" boolean in
  ena_update_queue_count() is false.
* ena_up() is called by the kernel (it's called asynchronously some
  time after ena_probe()). ena_setup_io_intr() is called by ena_up() and
  it uses io_queue_count to get the suitable irq lines for each msix
  vector. The function ena_request_io_irq() is called right after that
  and it uses msix_vecs - This value only changes during ena_probe() and
  ena_restore() - to request the irq vectors. This results in "Failed to
  request I/O IRQ" error for i > io_queue_count.

Numeric example:
* After ena_probe() io_queue_count = 8, msix_vecs = 9.
* The number of queues changes to 4 -> io_queue_count = 4, msix_vecs = 9.
* ena_up() is executed for the first time:
  ** ena_setup_io_intr() inits the vectors only up to io_queue_count.
  ** ena_request_io_irq() calls request_irq() and fails for i = 5.

How to reproduce:
simply run the following commands:
    sudo rmmod ena && sudo insmod ena.ko;
    sudo ethtool -L eth1 combined 3;

Fix:
Use ENA_MAX_MSIX_VEC(adapter->num_io_queues + adapter->xdp_num_queues)
instead of adapter->msix_vecs. We need to take XDP queues into
consideration as they need to have msix vectors assigned to them as well.
Note that XDP cannot be attached before the driver is up and running,
but in XDP mode the issue might occur when the number of queues changes
right after a reset trigger.
ENA_MAX_MSIX_VEC simply adds one to its argument since the first msix
vector is reserved for the management queue.

Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
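
The numeric example in the message can be replayed with a small model.
model_max_msix_vec() mirrors the "+1 for the management queue" rule described
above, and first_bad_vector() is a hypothetical helper that returns the first
vector index requested without having been set up; neither is driver code.

```c
/* One extra vector is reserved for the management queue. */
static int model_max_msix_vec(int io_queues)
{
	return io_queues + 1;
}

/*
 * Vector 0 is the management vector; I/O vectors 1..initialized_io_queues
 * were set up. Returns the first requested I/O vector that was not set up,
 * or -1 if every requested vector is valid.
 */
static int first_bad_vector(int requested_vecs, int initialized_io_queues)
{
	for (int i = 1; i < requested_vecs; i++)
		if (i > initialized_io_queues)
			return i;
	return -1;
}
```

With the stale count of 9 vectors and 4 initialized queues the first failing
index is 5, matching the message; deriving the count from the current number
of queues avoids the failure.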