Commits · ecc832758a654e375924ebf06a4ac971acb5ce60 · openeuler / Kernel

01 Mar, 2018 12 commits

net/tcp/illinois: replace broken algorithm reference link · ecc83275

Committed by Joey Pabalinas on Feb 27, 2018

The link to the pdf containing the algorithm description is now a
dead link; it seems http://www.ifp.illinois.edu/~srikant/ has been
moved to https://sites.google.com/a/illinois.edu/srikant/ and none of
the original papers can be found there...

I have replaced it with the only working copy I was able to find.

n.b. there is also a copy available at:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.6350&rep=rep1&type=pdf

However, this seems to only be a *cached* version, so I am unsure
exactly how reliable that link can be expected to remain over time
and have decided against using that one.
Signed-off-by: Joey Pabalinas <joeypabalinas@gmail.com>

 1 file changed, 1 insertion(+), 1 deletion(-)
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: purge write queue upon RST · a27fd7a8

Committed by Soheil Hassas Yeganeh on Feb 27, 2018

When the connection is reset, there is no point in
keeping the packets on the write queue until the connection
is closed.

RFC 793 (page 70) and RFC 793-bis (page 64) both suggest
purging the write queue upon RST:
https://tools.ietf.org/html/draft-ietf-tcpm-rfc793bis-07

Moreover, this is essential for a correct MSG_ZEROCOPY
implementation, because userspace cannot call close(fd)
before receiving zerocopy signals even when the connection
is reset.

Fixes: f214f915 ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tcp-revert-a-F-RTO-extension-due-to-broken-middle-boxes' · 55e84dd7

Committed by David S. Miller on Feb 28, 2018

Yuchung Cheng says:

====================
tcp: revert a F-RTO extension due to broken middle-boxes

This patch series reverts a (non-standard) TCP F-RTO extension that aimed
to detect more spurious timeouts. Unfortunately it could result in poor
performance due to broken middle-boxes that modify TCP packets. E.g.
https://www.spinics.net/lists/netdev/msg484154.html
We believe the best and simplest solution is to just revert the change.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: revert F-RTO extension to detect more spurious timeouts · fc68e171

Committed by Yuchung Cheng on Feb 27, 2018

This reverts commit 89fe18e4.

While the patch could detect more spurious timeouts, it could cause
poor TCP performance on broken middle-boxes that modify TCP packets
(e.g. receive window, SACK options). Since the performance gain is
much smaller than the potential loss, the best solution is
to fully revert the change.

Fixes: 89fe18e4 ("tcp: extend F-RTO to catch more spurious timeouts")
Reported-by: Teodor Milkov <tm@del.bg>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: revert F-RTO middle-box workaround · d4131f09

Committed by Yuchung Cheng on Feb 27, 2018

This reverts commit cc663f4d. While it works around
some broken middle-boxes that modify receive window fields, it does not
address middle-boxes that strip off SACK options. The best solution is
to fully revert this patch and the root F-RTO enhancement.

Fixes: cc663f4d ("tcp: restrict F-RTO to work-around broken middle-boxes")
Reported-by: Teodor Milkov <tm@del.bg>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 's390-qeth-fixes' · c8431622

Committed by David S. Miller on Feb 28, 2018

Julian Wiedmann says:

====================
s390/qeth: fixes 2018-02-27

Please apply some more qeth patches for -net and stable.

One patch fixes a performance bug in the TSO path. Then there are several
more fixes for IP management on L3 devices - including a revert, so that
the subsequent fix cleanly applies to earlier kernels.
The final patch takes care of a race in the control IO code that causes
qeth to miss the cmd response, and subsequently trigger device recovery.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: fix IPA command submission race · d22ffb5a

Committed by Julian Wiedmann on Feb 27, 2018

If multiple IPA commands are built & sent out concurrently,
fill_ipacmd_header() may assign a seqno value to a command that's
different from what send_control_data() later assigns to this command's
reply.
This is due to other commands passing through send_control_data(),
and incrementing card->seqno.ipa along the way.

So one IPA command has no reply that's waiting for its seqno, while some
other IPA command has multiple reply objects waiting for it.
Only one of those waiting replies wins, and the other(s) time out and
trigger a recovery via send_ipa_cmd().

Fix this by making sure that the same seqno value is assigned to
a command and its reply object.
Do so immediately before submitting the command & while holding the
irq_pending "lock", to produce nicely ascending seqnos.

As a side effect, *all* IPA commands now use a reply object that's
waiting for its actual seqno. Previously, early IPA commands that were
submitted while the card was still DOWN used the "catch-all" IDX seqno.
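
The ordering described above can be illustrated with a minimal user-space sketch. It is not the driver's code: `struct card`, `struct ipa_cmd`, `struct ipa_reply` and `submit_cmd()` are hypothetical stand-ins, and a C11 `atomic_flag` plays the role of the irq_pending "lock". The point is only that the command and its reply object receive the same seqno inside one critical section.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's card, command and reply objects. */
struct card {
	atomic_flag busy;     /* plays the role of the irq_pending "lock" */
	uint32_t next_seqno;  /* plays the role of card->seqno.ipa */
};

struct ipa_cmd   { uint32_t seqno; };
struct ipa_reply { uint32_t seqno; };

/*
 * Assign one seqno to both the command and its reply object while the
 * submission lock is held, so that concurrent submitters cannot slip in
 * between the two assignments.
 */
static void submit_cmd(struct card *card, struct ipa_cmd *cmd,
		       struct ipa_reply *reply)
{
	while (atomic_flag_test_and_set(&card->busy))
		;  /* spin until the previous submitter is done */

	cmd->seqno = card->next_seqno;
	reply->seqno = cmd->seqno;
	card->next_seqno++;
	/* ... the command would be handed to the hardware here ... */

	atomic_flag_clear(&card->busy);
}
```

Because the counter is only advanced inside the critical section, consecutive submissions see ascending seqnos, which matches the "nicely ascending" property the message mentions.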
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: fix IP address lookup for L3 devices · c5c48c58

Committed by Julian Wiedmann on Feb 27, 2018

Current code ("qeth_l3_ip_from_hash()") matches a queried address object
against objects in the IP table by IP address, Mask/Prefix Length and
MAC address ("qeth_l3_ipaddrs_is_equal()"). But what callers actually
require is either
a) "is this IP address registered" (i.e. match by IP address only),
before adding a new address.
b) or "is this address object registered" (i.e. match all relevant
   attributes), before deleting an address.

Right now
1. the ADD path is too strict in its lookup, and e.g. doesn't detect
conflicts between an existing NORMAL address and a new VIPA address
(because the NORMAL address will have mask != 0, while VIPA has
a mask == 0),
2. the DELETE path is not strict enough, and e.g. allows del_rxip() to
delete a VIPA address as long as the IP address matches.

Fix all this by adding helpers (_addr_match_ip() and _addr_match_all())
that do the appropriate checking.

Note that the ADD path for NORMAL addresses is special, as qeth keeps
track of how many times such an address is in use (and there is no
immediate way of returning errors to the caller). So when a requested
NORMAL address _fully_ matches an existing one, it's not considered a
conflict and we merely increment the refcount.
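
The difference between the two lookups can be sketched as follows. This is a simplified, IPv4-only illustration, not the driver's implementation: `struct ip_addr`, `enum addr_type` and the two function names are hypothetical, and the choice of which attributes count as "all relevant" (here: type, mask and address) is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified address object; the driver's real struct differs. */
enum addr_type { ADDR_NORMAL, ADDR_VIPA, ADDR_RXIP };

struct ip_addr {
	enum addr_type type;
	uint32_t addr;
	uint32_t mask;
};

/* "Is this IP address registered?" Loose match, used before an ADD. */
static bool addr_match_ip(const struct ip_addr *a, const struct ip_addr *b)
{
	return a->addr == b->addr;
}

/* "Is this address object registered?" Strict match, used before a DELETE. */
static bool addr_match_all(const struct ip_addr *a, const struct ip_addr *b)
{
	return a->type == b->type && a->mask == b->mask && addr_match_ip(a, b);
}
```

With these two predicates, a NORMAL address with a non-zero mask and a VIPA address with mask 0 on the same IP conflict on ADD (loose match), while a DELETE request for one of them cannot remove the other (strict match).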

Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "s390/qeth: fix using of ref counter for rxip addresses" · 4964c66f

Committed by Julian Wiedmann on Feb 27, 2018

This reverts commit cb816192.

The issue this attempted to fix never actually occurs.
l3_add_rxip() checks (via l3_ip_from_hash()) if the requested address
was previously added to the card. If so, it returns -EEXIST and doesn't
call l3_add_ip().
As a result, the "address exists" path in l3_add_ip() is never taken
for rxip addresses, and this patch had no effect.

Fixes: cb816192 ("s390/qeth: fix using of ref counter for rxip addresses")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: fix double-free on IP add/remove race · 14d066c3

Committed by Julian Wiedmann on Feb 27, 2018

Registering an IPv4 address with the HW takes quite a while, so we
temporarily drop the ip_htable lock. Any concurrent add/remove of the
same IP adjusts the IP's use count, and (on remove) is then blocked by
addr->in_progress.
After the register call has completed, we check the use count for
concurrently attempted add/remove calls - and possibly straight-away
deregister the IP again. This happens via l3_delete_ip(), which
1) looks up the queried IP in the htable (getting a reference to the
   *same* queried object),
2) deregisters the IP from the HW, and
3) frees the IP object.

The caller in l3_add_ip() then does a second free on the same object.

For this case, skip all the extra checks and lookups in l3_delete_ip()
and just deregister & free the IP object ourselves.

Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: fix IP removal on offline cards · 98d823ab

Committed by Julian Wiedmann on Feb 27, 2018

If the HW is not reachable, then none of the IPs in qeth's internal
table has been registered with the HW yet. So when deleting such an IP,
there's no need to stage it for deregistration - just drop it from
the table.

This fixes the "add-delete-add" scenario on an offline card, where the
second "add" merely increments the IP's use count. But as the IP is
still set to DISP_ADDR_DELETE from the previous "delete" step,
l3_recover_ip() won't register it with the HW when the card goes online.

Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: fix overestimated count of buffer elements · 12472af8

Committed by Julian Wiedmann on Feb 27, 2018

qeth_get_elements_for_range() doesn't know how to handle a 0-length
range (i.e. start == end), and returns 1 when it should return 0.
Such ranges occur on TSO skbs, where the L2/L3/L4 headers (and thus all
of the skb's linear data) are skipped when mapping the skb into regular
buffer elements.

This overestimation may cause several performance-related issues:
1. sub-optimal IO buffer selection, where the next buffer gets selected
   even though the skb would actually still fit into the current buffer.
2. forced linearization, if the element count for a non-linear skb
   exceeds QETH_MAX_BUFFER_ELEMENTS.

Rather than modifying qeth_get_elements_for_range() and adding overhead
to every caller, fix up those callers that are at risk of passing a
0-length range.
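
A small sketch of why an empty range can be counted as one element, and of a caller-side guard. The formula below is an assumption about the general shape of such a page-counting helper (last page minus first page, plus one), not a copy of the driver's function; `SKETCH_PAGE_SIZE`, `elements_for_range()` and `elements_checked()` are hypothetical names.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/*
 * Number of pages touched by the byte range [start, end).
 * Only meaningful for non-empty ranges: with start == end in the middle
 * of a page, (end - 1) and start fall into the same page and the result
 * is 1, although no byte is covered.
 */
static unsigned int elements_for_range(uintptr_t start, uintptr_t end)
{
	uintptr_t last_page  = (end - 1) / SKETCH_PAGE_SIZE;
	uintptr_t first_page = start / SKETCH_PAGE_SIZE;

	return (unsigned int)(last_page - first_page + 1);
}

/* Caller-side guard: skip the helper entirely for a 0-length range. */
static unsigned int elements_checked(uintptr_t start, uintptr_t end)
{
	return start == end ? 0 : elements_for_range(start, end);
}
```

Keeping the guard in the few callers that can see an empty range leaves the common path free of the extra comparison, which is the trade-off the message describes.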

Fixes: 2863c613 ("qeth: refactor calculation of SBALE count")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Feb, 2018 12 commits

gianfar: Fix Rx byte accounting for ndev stats · 590399dd

Committed by Claudiu Manoil on Feb 27, 2018

Don't include the FCB (frame control block), the padding bytes inserted
by the controller into the frame payload, or the FCS in the Rx byte count
of the packet sent up the stack. All of these are pulled out of the skb
by gfar_process_frame().
This issue is old, likely from the driver's beginnings; however,
it was amplified by the recent
commit d903ec77 ("gianfar: simplify FCS handling and fix memory leak")
which basically added the FCS to the Rx bytecount, and so brought
this to my attention.
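
The accounting rule amounts to simple arithmetic, sketched below. The helper name and its parameters are hypothetical; the only fixed value is the 4-byte Ethernet FCS, while the FCB and padding lengths are passed in because their sizes are not stated in the message.

```c
#include <stddef.h>

#define SKETCH_ETH_FCS_LEN 4  /* Ethernet frame check sequence, 4 bytes */

/*
 * Bytes to add to the Rx byte counter for one received buffer:
 * everything except the FCB, the controller-inserted padding and the FCS.
 * Returns 0 if the buffer is too short to hold that overhead.
 */
static size_t rx_bytes_for_stack(size_t buf_len, size_t fcb_len, size_t pad_len)
{
	size_t overhead = fcb_len + pad_len + SKETCH_ETH_FCS_LEN;

	return buf_len > overhead ? buf_len - overhead : 0;
}
```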
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cdc_ether: flag the Cinterion PLS8 modem by gemalto as WWAN · 8ca88b54

Committed by Bassem Boubaker on Feb 27, 2018

The Cinterion PLS8 is an LTE modem with 2 possible WWAN interfaces.

The modem is controlled via AT commands through the exposed TTYs.

The AT^SWWAN write command can be used to activate or deactivate a WWAN
connection for a PDP context defined with AT+CGDCONT. The UE supports
two WWAN adapters. Both WWAN adapters can be activated at the same time.
Signed-off-by: Bassem Boubaker <bassem.boubaker@actia.fr>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tls: Use correct sk->sk_prot for IPV6 · c113187d

Committed by Boris Pismenny on Feb 27, 2018

The tls ulp overrides sk->sk_prot with new tls specific proto structs.
The tls specific structs were previously based on the ipv4 specific
tcp_prot struct.
As a result, attaching the tls ulp to an ipv6 tcp socket replaced
some ipv6 callbacks with the ipv4 equivalents.

This patch adds ipv6 tls proto structs and uses them when
attached to ipv6 sockets.

Fixes: 3c4d7559 ('tls: kernel TLS support')
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: uninline TSU register accessors · 55ea8743

Committed by Sergei Shtylyov on Feb 27, 2018

We have uninlined the sh_eth_{read|write}() functions introduced in the
commit 4a55530f ("net: sh_eth: modify the definitions of register").
Now remove *inline* from sh_eth_tsu_{read|write}() as well and move
these functions from the header to the driver itself. This saves 684
more bytes of object code (ARM gcc 4.8.5)...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tunnel-mtu-fixes' · ff2926d8

Committed by David S. Miller on Feb 27, 2018

Xin Long says:

====================
net: fix IFLA_MTU ignored on NEWLINK for some ip and ipv6 tunnels

The fix for ip_gre follows the way other ip tunnels do: not to
set mtu in ndo_init, as ip_tunnel_newlink will take care of it
properly.

The fix for ip6_tunnel and sit follows the way ipv6 tunnels do:
to set mtu again according to IFLA_MTU afterwards, as all bind_dev
are called in ndo_init where it can't get the tb[IFLA_MTU].
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

sit: fix IFLA_MTU ignored on NEWLINK · 2b3957c3

Committed by Xin Long on Feb 27, 2018

Commit 128bb975 ("ip6_gre: init dev->mtu and dev->hard_header_len
correctly") fixed IFLA_MTU ignored on NEWLINK for ip6_gre. The same
mtu fix is also needed for sit.

Note that dev->hard_header_len setting for sit works fine, no need to
fix it. sit is actually an ipv4 tunnel, so it can't call ip6_tnl_change_mtu
to set mtu.
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6_tunnel: fix IFLA_MTU ignored on NEWLINK · a6aa8044

Committed by Xin Long on Feb 27, 2018

Commit 128bb975 ("ip6_gre: init dev->mtu and dev->hard_header_len
correctly") fixed IFLA_MTU ignored on NEWLINK for ip6_gre. The same
mtu fix is also needed for ip6_tunnel.

Note that dev->hard_header_len setting for ip6_tunnel works fine,
no need to fix it.
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_gre: fix IFLA_MTU ignored on NEWLINK · ffc2b6ee

Committed by Xin Long on Feb 27, 2018

It's safe to remove the setting of dev's needed_headroom and mtu in
__gre_tunnel_init, as discussed in [1], ip_tunnel_newlink can do it
properly.

Now Eric noticed that it could overwrite the mtu value set in do_setlink
when creating an ip_gre dev, so the IFLA_MTU param does not take effect.

So this patch is to remove them to make IFLA_MTU work, as in other
ipv4 tunnels.

  [1]: https://patchwork.ozlabs.org/patch/823504/

Fixes: c5441932 ("GRE: Refactor GRE tunneling code.")
Reported-by: Eric Garver <e@erig.me>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: Restore phy_resume() locking assumption · 9c2c2e62

Committed by Andrew Lunn on Feb 27, 2018

commit f5e64032 ("net: phy: fix resume handling") changes the
locking semantics for phy_resume() such that the caller now needs to
hold the phy mutex. Not all call sites were adapted to this new
semantic, resulting in warnings from the added
WARN_ON(!mutex_is_locked(&phydev->lock)).  Rather than change the
semantics, add a __phy_resume() and restore the old behavior of
phy_resume().
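
The pattern described here, a "caller holds the lock" variant plus a wrapper that takes the lock itself, can be sketched in plain C. Everything below is a hypothetical user-space illustration: `struct phy_dev`, `phy_resume_locked_sketch()` and `phy_resume_sketch()` are made-up names, and a boolean stands in for the real mutex.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a PHY device; a bool models the mutex state. */
struct phy_dev {
	bool lock_held;  /* stand-in for phydev->lock */
	bool suspended;
};

/* Locked variant: the caller must already hold the lock. */
static int phy_resume_locked_sketch(struct phy_dev *phy)
{
	if (!phy->lock_held)
		return -1;  /* the kernel would emit a WARN_ON() here */
	phy->suspended = false;
	return 0;
}

/* Wrapper with the old semantics: takes and releases the lock itself. */
static int phy_resume_sketch(struct phy_dev *phy)
{
	int ret;

	phy->lock_held = true;   /* would be mutex_lock() */
	ret = phy_resume_locked_sketch(phy);
	phy->lock_held = false;  /* would be mutex_unlock() */
	return ret;
}
```

Existing callers keep calling the wrapper unchanged, while code paths that already hold the lock call the locked variant directly.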
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: f5e64032 ("net: phy: fix resume handling")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: correct initial value for group congestion flag · 1b22bcad

Committed by Jon Maloy on Feb 26, 2018

In commit 60c25306 ("tipc: fix race between poll() and
setsockopt()") we introduced a pointer from struct tipc_group to the
'group_is_connected' flag in struct tipc_sock, so that this field can
be checked without dereferencing the group pointer of the latter struct.

The initial value for this flag is correctly set to 'false' when a
group is created, but we miss the case when no group is created at
all, in which case the initial value should be 'true'. This has the
effect that SOCK_RDM/DGRAM sockets sending datagrams never receive
POLLOUT if they request so.

This commit corrects this bug.

Fixes: 60c25306 ("tipc: fix race between poll() and setsockopt()")
Reported-by: Hoang Le <hoang.h.le@dektek.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: Fix resource coverity errors · 3d18e4f1

Committed by Arkadi Sharshevsky on Feb 26, 2018

Fix resource coverity errors.

Fixes: d9f9b9a4 ("devlink: Add support for resource abstraction")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68 · c7272c2f

Committed by Sabrina Dubroca on Feb 26, 2018

According to RFC 1191 sections 3 and 4, ICMP frag-needed messages
indicating an MTU below 68 should be rejected:

    A host MUST never reduce its estimate of the Path MTU below 68
    octets.

and (talking about ICMP frag-needed's Next-Hop MTU field):

    This field will never contain a value less than 68, since every
    router "must be able to forward a datagram of 68 octets without
    fragmentation".

Furthermore, by letting net.ipv4.route.min_pmtu be set to negative
values, we can end up with a very large PMTU when (-1) is cast into u32.

Let's also make ip_rt_min_pmtu a u32, since it's only ever compared to
unsigned ints.
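
The effect of the cast, and a range check of the kind the subject line implies, can be shown in a few lines of C. This is an illustrative sketch only: `as_u32()` and `set_min_pmtu()` are hypothetical helpers, and the kernel enforces the bound through its sysctl machinery rather than through a function like this.

```c
#include <stdint.h>

#define MIN_PMTU_FLOOR 68  /* lower bound from RFC 1191, as quoted above */

/* A negative int reinterpreted as u32 becomes a very large value. */
static uint32_t as_u32(int value)
{
	return (uint32_t)value;
}

/*
 * Store a requested minimum PMTU only if it lies within [68, UINT32_MAX].
 * Returns 0 on success, -1 if the value is rejected.
 */
static int set_min_pmtu(uint32_t *min_pmtu, long long requested)
{
	if (requested < MIN_PMTU_FLOOR || requested > UINT32_MAX)
		return -1;
	*min_pmtu = (uint32_t)requested;
	return 0;
}
```

For example, `as_u32(-1)` yields 4294967295, which is why an unchecked negative setting would act as an enormous minimum PMTU.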
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Feb, 2018 16 commits

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 68b116a2

Committed by David S. Miller on Feb 27, 2018

Johan Hedberg says:

====================
pull request: bluetooth 2018-02-26

Here are two Bluetooth driver fixes for the 4.16 kernel.

Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: Compare to size_new in case of resource child validation · b9d17175

Committed by Arkadi Sharshevsky on Feb 26, 2018

The current implementation checks the combined size of the children with
the 'size' of the parent. The correct behavior is to check the combined
size vs the pending change and to compare vs the 'size_new'.
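
A minimal sketch of the corrected check follows. The structure is a hypothetical, flattened stand-in for a devlink resource, and summing the children's pending sizes (rather than their current sizes) is an assumption made for this illustration; only the comparison against the parent's `size_new` is taken from the message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened resource node; the real devlink structures differ. */
struct resource {
	uint64_t size;      /* currently applied size */
	uint64_t size_new;  /* pending size, applied on the next reload */
	const struct resource *children;
	size_t n_children;
};

/* The children's combined pending size must fit the parent's pending size. */
static bool children_fit(const struct resource *parent)
{
	uint64_t total = 0;

	for (size_t i = 0; i < parent->n_children; i++)
		total += parent->children[i].size_new;

	return total <= parent->size_new;
}
```

Comparing against `size` instead would accept or reject a configuration based on a value that is about to be replaced.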

Fixes: d9f9b9a4 ("devlink: Add support for resource abstraction")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Tested-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: fix tx packets accounting · 4c27bf3c

Committed by Eric Dumazet on Feb 25, 2018

r8152 driver handles TSO packets (limited to ~16KB) quite well,
but pretends each TSO logical packet is a single packet on the wire.

There is also some error since headers are accounted once, but
error rate is small enough that we do not care.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel: Do not use mark in skb by default · 4e994776

Committed by Thomas Winter on Feb 26, 2018

This reverts commit 5c38bd1b.

skb->mark contains the mark of the encapsulated traffic, which
can result in incorrect routing decisions being made, such
as routing loops if the route chosen is via the tunnel itself.
The correct method should be to use tunnel->fwmark.
Signed-off-by: Thomas Winter <thomas.winter@alliedtelesis.co.nz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Fix VLAN reference count problem · 0e5a82ef

Committed by Ido Schimmel on Feb 25, 2018

When a VLAN is added on a port, a reference is taken on the
corresponding master VLAN entry. If it does not already exist, then it
is created and a reference taken.

However, in the second case a reference is not really taken when
CONFIG_REFCOUNT_FULL is enabled as refcount_inc() is replaced by
refcount_inc_not_zero().

Fix this by using refcount_set() on a newly created master VLAN entry.
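
Why an increment is the wrong way to initialize such a counter can be shown with a toy model. The type and helpers below are hypothetical user-space stand-ins that only mimic the "never increment from zero" rule; they are not the kernel's refcount_t implementation and are not thread-safe.

```c
#include <stdbool.h>

/* Toy counter that mimics the "no increment from zero" rule. */
struct ref {
	unsigned int count;
};

/* Modeled on refcount_inc_not_zero(): refuses to revive a zero count. */
static bool ref_inc_not_zero(struct ref *r)
{
	if (r->count == 0)
		return false;
	r->count++;
	return true;
}

/* Modeled on refcount_set(): the right call for a freshly created object. */
static void ref_set(struct ref *r, unsigned int n)
{
	r->count = n;
}
```

A newly allocated object whose counter starts at 0 therefore stays at 0 after an increment attempt, so the first reference has to be established with an explicit set.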

Fixes: 25127759 ("net, bridge: convert net_bridge_vlan.refcnt from atomic_t to refcount_t")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

DT: net: renesas,ravb: document R8A77980 bindings · 3a291aa1

Committed by Sergei Shtylyov on Feb 01, 2018

Renesas R-Car V3H (R8A77980) SoC has the R-Car gen3 compatible EtherAVB
device, so document the SoC specific bindings.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

qrtr: add MODULE_ALIAS macro to smd · c77f5fbb

Committed by Ramon Fried on Feb 25, 2018

Added MODULE_ALIAS("rpmsg:IPCRTR") to ensure qrtr-smd and qrtr will load
when IPCRTR channel is detected.
Signed-off-by: Ramon Fried <rfried@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

hdlc_ppp: carrier detect ok, don't turn off negotiation · b6c3bad1

Committed by Denis Du on Feb 24, 2018

Sometimes a physical line has just enough noise to make the protocol
handshake fail while carrier detect is still good. Then, after the noise
is removed, nothing triggers the protocol to start again, so the link
never comes back. The fix is to not terminate the protocol handshake
while the carrier is still on.
Signed-off-by: Denis Du <dudenis2000@yahoo.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

tuntap: correctly add the missing XDP flush · 1bb4f2e8

Committed by Jason Wang on Feb 24, 2018

We don't flush batched XDP packets through xdp_do_flush_map(), which
causes packets to stall in the TX queue. Considering that we don't do
XDP in NAPI poll(), the only possible fix is to call xdp_do_flush_map()
immediately after xdp_do_redirect().

Note that this in fact won't try to batch packets through devmap; we
could address that in the future.
Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Fixes: 761876c8 ("tap: XDP support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tuntap: disable preemption during XDP processing · 23e43f07

Committed by Jason Wang on Feb 24, 2018

Except for tuntap, all other drivers implement XDP in the NAPI
poll() routine in a bh. This guarantees that all XDP operations are done
on the same CPU, which is required by e.g. BPF_MAP_TYPE_PERCPU_ARRAY. But
for tuntap, we do it in process context and we try to protect XDP
processing with the RCU reader lock. This is insufficient since
CONFIG_PREEMPT_RCU can preempt the RCU reader critical section, which
breaks the assumption that all XDP is processed on the same CPU.

Fix this by simply disabling preemption during XDP processing.

Fixes: 761876c8 ("tap: XDP support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "tuntap: add missing xdp flush" · f249be4d

Committed by Jason Wang on Feb 24, 2018

This reverts commit 762c330d. The
reason is that we try to batch packets for devmap, which causes
xdp_do_flush() to be called in process context. Simply disabling preemption
may not work since the process may move among processors, which leads
xdp_do_flush() to miss some flushes on some processors.

So simply revert the patch; a follow-up patch will add the xdp flush
correctly.
Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Fixes: 762c330d ("tuntap: add missing xdp flush")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: fix crash in build_skb Rx code path · 0c5661ec

Committed by Emil Tantilov on Feb 23, 2018

Add check for build_skb enabled ring in ixgbe_dma_sync_frag().
In that case &skb_shinfo(skb)->frags[0] may not always be set which
can lead to a crash. Instead we derive the page offset from skb->data.

Fixes: 42073d91 ("ixgbe: Have the CPU take ownership of the buffers sooner")
CC: stable <stable@vger.kernel.org>
Reported-by: Ambarish Soman <asoman@redhat.com>
Suggested-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ARM: orion5x: Revert commit . · 13a55372

Committed by David S. Miller on Feb 26, 2018

It is not valid for orion5x to use mac_pton().

First of all, the orion5x buffer is not NULL terminated.  mac_pton()
has no business operating on non-NULL terminated buffers because
only the caller can know that this is valid and in what manner it
is ok to parse this NULL'less buffer.

Second of all, orion5x operates on an __iomem pointer, which cannot
be dereferenced using normal C pointer operations.  Accesses to
such areas must be performed with the proper iomem accessors.
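
The first objection, parsing a buffer that has no terminator, can be illustrated with a length-bounded parser. This is a hypothetical user-space sketch, not the board code: `hex_val()` and `parse_mac_bounded()` are made-up helpers, and the text format "XX:XX:XX:XX:XX:XX" is assumed. It does not address the second objection; in the kernel the bytes would first have to be read through an iomem accessor.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse "XX:XX:XX:XX:XX:XX" from a buffer of known length.
 * The caller states the length explicitly, so no terminator is needed
 * and no byte beyond buf[len - 1] is ever read.
 */
static bool parse_mac_bounded(const char *buf, size_t len, uint8_t mac[6])
{
	if (len < 17)
		return false;

	for (int i = 0; i < 6; i++) {
		int hi = hex_val(buf[3 * i]);
		int lo = hex_val(buf[3 * i + 1]);

		if (hi < 0 || lo < 0)
			return false;
		if (i < 5 && buf[3 * i + 2] != ':')
			return false;
		mac[i] = (uint8_t)(hi << 4 | lo);
	}
	return true;
}
```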

Fixes: 4904dbda ("ARM: orion5x: use mac_pton() helper")
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'l2tp-fix-API-races-discovered-by-syzbot' · 44e524cf

Committed by David S. Miller on Feb 26, 2018

James Chapman says:

====================
l2tp: fix API races discovered by syzbot

This patch series addresses several races with L2TP APIs discovered by
syzbot. There are no functional changes.

The set of patches 1-5 in combination fix the following syzbot reports.

19c09769f WARNING in debug_print_object
347bd5acd KASAN: use-after-free Read in inet_shutdown
6e6a5ec8d general protection fault in pppol2tp_connect
9df43faf0 KASAN: use-after-free Read in pppol2tp_connect

My first attempts to fix these issues were as net-next patches but
the series included other refactoring and cleanup work. I was asked to
separate out the bugfixes and redo for the net tree, which is what
these patches are.

The changes are:

 1. Fix inet_shutdown races when L2TP tunnels and sessions close. (patches 1-2)
 2. Fix races with tunnel and its socket. (patch 3)
 3. Fix race in pppol2tp_release with session and its socket. (patch 4)
 4. Fix tunnel lookup use-after-free. (patch 5)

All of the syzbot reproducers hit races in the tunnel and pppol2tp
session create and destroy paths. These tests create and destroy
pppol2tp tunnels and sessions rapidly using multiple threads,
provoking races in several tunnel/session create/destroy paths. The
key problem was that each tunnel/session socket could be destroyed
while its associated tunnel/session object still existed (patches 3,
4). Patch 5 addresses a problem with the way tunnels are removed from
the tunnel list. Patch 5 is tagged that it addresses all four syzbot
issues, though all 5 patches are needed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: fix tunnel lookup use-after-free race · 28f5bfb8

Committed by James Chapman on Feb 23, 2018

l2tp_tunnel_get walks the tunnel list to find a matching tunnel
instance and if a match is found, its refcount is increased before
returning the tunnel pointer. But when tunnel objects are destroyed,
they are still on the tunnel list after their refcount hits zero. Fix
this by moving the code that removes the tunnel from the tunnel list from
the tunnel socket destructor into the l2tp_tunnel_delete path,
before the tunnel refcount is decremented.
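
The ordering of the fix, unlink first and drop the reference afterwards, can be sketched with a singly linked list. This is a single-threaded, hypothetical illustration with made-up names (`struct tunnel`, `tunnel_get()`, `tunnel_delete()`); the real code uses RCU-protected lists, locking and refcount_t, none of which is modeled here.

```c
#include <stddef.h>

/* Hypothetical tunnel object on a singly linked list. */
struct tunnel {
	int id;
	unsigned int refcount;
	struct tunnel *next;
};

/* Lookup: takes a reference on the matching tunnel, if any. */
static struct tunnel *tunnel_get(struct tunnel *head, int id)
{
	for (struct tunnel *t = head; t; t = t->next) {
		if (t->id == id) {
			t->refcount++;
			return t;
		}
	}
	return NULL;
}

/*
 * Delete path: unlink the tunnel from the list first, then drop the
 * list's reference. Returns 1 if that was the last reference.
 * Once unlinked, later lookups can no longer find the object, so they
 * cannot take a reference on something whose count already hit zero.
 */
static int tunnel_delete(struct tunnel **head, struct tunnel *victim)
{
	for (struct tunnel **pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			break;
		}
	}
	return --victim->refcount == 0;
}
```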

refcount_t: increment on 0; use-after-free.
WARNING: CPU: 3 PID: 13507 at lib/refcount.c:153 refcount_inc+0x47/0x50
Modules linked in:
CPU: 3 PID: 13507 Comm: syzbot_6e6a5ec8 Not tainted 4.16.0-rc2+ #36
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
RIP: 0010:refcount_inc+0x47/0x50
RSP: 0018:ffff8800136ffb20 EFLAGS: 00010286
RAX: dffffc0000000008 RBX: ffff880017068e68 RCX: ffffffff814d3333
RDX: 0000000000000000 RSI: ffff88001a59f6d8 RDI: ffff88001a59f6d8
RBP: ffff8800136ffb28 R08: 0000000000000000 R09: 0000000000000000
R10: ffff8800136ffab0 R11: 0000000000000000 R12: ffff880017068e50
R13: 0000000000000000 R14: ffff8800174da800 R15: 0000000000000004
FS:  00007f403ab1e700(0000) GS:ffff88001a580000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000205fafd2 CR3: 0000000016770000 CR4: 00000000000006e0
Call Trace:
 l2tp_tunnel_get+0x2dd/0x4e0
 pppol2tp_connect+0x428/0x13c0
 ? pppol2tp_session_create+0x170/0x170
 ? __might_fault+0x115/0x1d0
 ? lock_downgrade+0x860/0x860
 ? __might_fault+0xe5/0x1d0
 ? security_socket_connect+0x8e/0xc0
 SYSC_connect+0x1b6/0x310
 ? SYSC_bind+0x280/0x280
 ? __do_page_fault+0x5d1/0xca0
 ? up_read+0x1f/0x40
 ? __do_page_fault+0x3c8/0xca0
 SyS_connect+0x29/0x30
 ? SyS_accept+0x40/0x40
 do_syscall_64+0x1e0/0x730
 ? trace_hardirqs_off_thunk+0x1a/0x1c
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7f403a42f259
RSP: 002b:00007f403ab1dee8 EFLAGS: 00000296 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 00000000205fafe4 RCX: 00007f403a42f259
RDX: 000000000000002e RSI: 00000000205fafd2 RDI: 0000000000000004
RBP: 00007f403ab1df20 R08: 00007f403ab1e700 R09: 0000000000000000
R10: 00007f403ab1e700 R11: 0000000000000296 R12: 0000000000000000
R13: 00007ffc81906cbf R14: 0000000000000000 R15: 00007f403ab2b040
Code: 3b ff 5b 5d c3 e8 ca 5f 3b ff 80 3d 49 8e 66 04 00 75 ea e8 bc 5f 3b ff 48 c7 c7 60 69 64 85 c6 05 34 8e 66 04 01 e8 59 49 15 ff <0f> 0b eb ce 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 53 49

Fixes: f8ccac0e ("l2tp: put tunnel socket release on a workqueue")
Reported-and-tested-by: syzbot+19c09769f14b48810113@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+347bd5acde002e353a36@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+6e6a5ec8de31a94cd015@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+9df43faf09bd400f2993@syzkaller.appspotmail.com
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: fix race in pppol2tp_release with session object destroy · d02ba2a6

Committed by James Chapman on Feb 23, 2018

pppol2tp_release uses call_rcu to put the final ref on its socket. But
the session object doesn't hold a ref on the session socket so may be
freed while the pppol2tp_put_sk RCU callback is scheduled. Fix this by
having the session hold a ref on its socket until the session is
destroyed. It is this ref that is dropped via call_rcu.

Sessions are also deleted via l2tp_tunnel_closeall. This must now also put
the final ref via call_rcu. So move the call_rcu call site into
pppol2tp_session_close so that this happens in both destroy paths. A
common destroy path should really be implemented, perhaps with
l2tp_tunnel_closeall calling l2tp_session_delete like pppol2tp_release
does, but this will be looked at later.

ODEBUG: activate active (active state 1) object type: rcu_head hint:           (null)
WARNING: CPU: 3 PID: 13407 at lib/debugobjects.c:291 debug_print_object+0x166/0x220
Modules linked in:
CPU: 3 PID: 13407 Comm: syzbot_19c09769 Not tainted 4.16.0-rc2+ #38
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
RIP: 0010:debug_print_object+0x166/0x220
RSP: 0018:ffff880013647a00 EFLAGS: 00010082
RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff814d3333
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88001a59f6d0
RBP: ffff880013647a40 R08: 0000000000000000 R09: 0000000000000001
R10: ffff8800136479a8 R11: 0000000000000000 R12: 0000000000000001
R13: ffffffff86161420 R14: ffffffff85648b60 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88001a580000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020e77000 CR3: 0000000006022000 CR4: 00000000000006e0
Call Trace:
 debug_object_activate+0x38b/0x530
 ? debug_object_assert_init+0x3b0/0x3b0
 ? __mutex_unlock_slowpath+0x85/0x8b0
 ? pppol2tp_session_destruct+0x110/0x110
 __call_rcu.constprop.66+0x39/0x890
 ? __call_rcu.constprop.66+0x39/0x890
 call_rcu_sched+0x17/0x20
 pppol2tp_release+0x2c7/0x440
 ? fcntl_setlk+0xca0/0xca0
 ? sock_alloc_file+0x340/0x340
 sock_release+0x92/0x1e0
 sock_close+0x1b/0x20
 __fput+0x296/0x6e0
 ____fput+0x1a/0x20
 task_work_run+0x127/0x1a0
 do_exit+0x7f9/0x2ce0
 ? SYSC_connect+0x212/0x310
 ? mm_update_next_owner+0x690/0x690
 ? up_read+0x1f/0x40
 ? __do_page_fault+0x3c8/0xca0
 do_group_exit+0x10d/0x330
 ? do_group_exit+0x330/0x330
 SyS_exit_group+0x22/0x30
 do_syscall_64+0x1e0/0x730
 ? trace_hardirqs_off_thunk+0x1a/0x1c
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7f362e471259
RSP: 002b:00007ffe389abe08 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f362e471259
RDX: 00007f362e471259 RSI: 000000000000002e RDI: 0000000000000000
RBP: 00007ffe389abe30 R08: 0000000000000000 R09: 00007f362e944270
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000400b60
R13: 00007ffe389abf50 R14: 0000000000000000 R15: 0000000000000000
Code: 8d 3c dd a0 8f 64 85 48 89 fa 48 c1 ea 03 80 3c 02 00 75 7b 48 8b 14 dd a0 8f 64 85 4c 89 f6 48 c7 c7 20 85 64 85 e8 2a 55 14 ff <0f> 0b 83 05 ad 2a 68 04 01 48 83 c4 18 5b 41 5c 41 5d 41 5e 41

Fixes: ee40fb2e ("l2tp: protect sock pointer of struct pppol2tp_session with RCU")
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openeuler / Kernel · synced successfully more than 2 years ago