Commits · e30cb13c5a09ff5f043a6570c32e49b063bea6a1 · openeuler / Kernel

02 Aug, 2018 4 commits

Revert "net/ipv6: fix metrics leak" · e6aed040

Authored by David S. Miller on Aug 01, 2018

This reverts commit df18b504.

This change causes other problems and use-after-free situations as
found by syzbot.

Signed-off-by: David S. Miller <davem@davemloft.net>


rxrpc: Fix user call ID check in rxrpc_service_prealloc_one · c01f6c9b

Authored by YueHaibing on Aug 01, 2018

The check here only verifies that the user call ID isn't already in use,
so it should compare user_call_ID with xcall->user_call_ID, which is the
current node's user_call_ID.

Fixes: 540b1c48 ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: Do not suspend/resume closed slave_dev · a94c689e

Authored by Florian Fainelli on Jul 31, 2018

If a DSA slave network device was previously disabled, there is no need
to suspend or resume it.

Fixes: 24462549 ("net: dsa: allow switch drivers to implement suspend/resume hooks")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>


netlink: Fix spectre v1 gadget in netlink_create() · bc5b6c0b

Authored by Jeremy Cline on Jul 31, 2018

'protocol' is a user-controlled value, so sanitize it after the bounds
check to avoid using it for speculative out-of-bounds access to arrays
indexed by it.

This addresses the following accesses detected with the help of smatch:

* net/netlink/af_netlink.c:654 __netlink_create() warn: potential
  spectre issue 'nlk_cb_mutex_keys' [w]

* net/netlink/af_netlink.c:654 __netlink_create() warn: potential
  spectre issue 'nlk_cb_mutex_key_strings' [w]

* net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre
  issue 'nl_table' [w] (local cap)

Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
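
The "sanitize after the bounds check" pattern described above can be sketched in portable C. This is only an illustration of the masking arithmetic, not the patch itself: in-kernel code would normally use the array_index_nospec() helper from <linux/nospec.h>, and the names index_mask(), table_lookup() and TABLE_SIZE below are hypothetical. A userspace compiler is also free to optimize this differently, so the sketch makes no claim of actually closing a Spectre v1 gadget outside the kernel.

```c
#include <limits.h>
#include <stddef.h>

#define TABLE_SIZE 32UL	/* hypothetical table size, for illustration only */

static int table[TABLE_SIZE];

/*
 * Returns all-ones when index < size and zero otherwise, without a branch.
 * Assumes index and size are at most LONG_MAX and that >> on a negative
 * long is an arithmetic shift (true for GCC and Clang).
 */
unsigned long index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Bounds check first, then clamp the index before it is used. */
int table_lookup(unsigned long protocol)
{
	if (protocol >= TABLE_SIZE)
		return -1;

	/* Even if the branch above is mispredicted, the index stays in range. */
	protocol &= index_mask(protocol, TABLE_SIZE);

	return table[protocol];
}
```

The point of the second step is that the clamp is a data dependency rather than a control dependency, so a mispredicted bounds check cannot steer the load to an out-of-bounds address.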


01 Aug, 2018 2 commits

ipv4: frags: handle possible skb truesize change · 4672694b

Authored by Eric Dumazet on Jul 30, 2018

ip_frag_queue() might call pskb_pull() on one skb that
is already in the fragment queue.

We need to take care of possible truesize change, or we
might have an imbalance of the netns frags memory usage.

IPv6 is immune to this bug, because RFC5722, Section 4,
amended by Errata ID 3089 states :

  When reassembling an IPv6 datagram, if
  one or more its constituent fragments is determined to be an
  overlapping fragment, the entire datagram (and any constituent
  fragments) MUST be silently discarded.

Fixes: 158f323b ("net: adjust skb->truesize in pskb_expand_head()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


inet: frag: enforce memory limits earlier · 56e2c94f

Authored by Eric Dumazet on Jul 30, 2018

We currently check current frags memory usage only when
a new frag queue is created. This allows attackers to first
consume the memory budget (default : 4 MB) creating thousands
of frag queues, then sending tiny skbs to exceed high_thresh
limit by 2 to 3 orders of magnitude.

Note that before commit 648700f7 ("inet: frags: use rhashtables
for reassembly units"), work queue could be starved under DOS,
getting no cpu cycles.
After commit 648700f7, only the per frag queue timer can eventually
remove an incomplete frag queue and its skbs.

Fixes: b13d3cbf ("inet: frag: move eviction of queues to work queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jann Horn <jannh@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Peter Oskolkov <posk@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


31 Jul, 2018 2 commits

netlink: Don't shift with UB on nlk->ngroups · 61f4b237

Authored by Dmitry Safonov on Jul 30, 2018

On i386 nlk->ngroups might be 32 or 0, which leads to UB and results in
a hang during boot.
Check for 0 ngroups and use (unsigned long long) as a type to shift.

Fixes: 7acf9d42 ("netlink: Do not subscribe to non-existent groups").
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
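
The shift problem described above can be reproduced in plain C: shifting a 32-bit value by 32 is undefined behaviour, so building a group mask with a 32-bit "1 << ngroups" breaks exactly when ngroups is 32. The sketch below models the i386 case with uint32_t and follows the two steps the message names (check for 0, shift as unsigned long long). The function name mask_groups() is hypothetical and this is not the kernel code.

```c
#include <stdint.h>

/*
 * Keep only the bits of 'groups' that correspond to existing groups.
 * uint32_t stands in for a 32-bit 'unsigned long' as on i386.
 * Precondition (an assumption of this sketch): ngroups <= 32.
 */
uint32_t mask_groups(uint32_t groups, unsigned int ngroups)
{
	if (ngroups == 0)
		return 0;

	/*
	 * Shifting in a 64-bit type keeps the shift count below the type
	 * width even for ngroups == 32, so the expression is well defined.
	 */
	return groups & (uint32_t)((1ULL << ngroups) - 1);
}
```

With ngroups == 32 the mask evaluates to 0xffffffff, so every requested group is kept, while ngroups == 0 yields an empty subscription.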


net/ipv6: fix metrics leak · df18b504

Authored by Sabrina Dubroca on Jul 30, 2018

Since commit d4ead6b3 ("net/ipv6: move metrics from dst to
rt6_info"), ipv6 metrics are shared and refcounted. rt6_set_from()
assigns the rt->from pointer and increases the refcount on from's
metrics. This reference is never released.

Introduce the fib6_metrics_release() helper and use it to release the
metrics.

Fixes: d4ead6b3 ("net/ipv6: move metrics from dst to rt6_info")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


30 Jul, 2018 2 commits

openvswitch: meter: Fix setting meter id for new entries · 25432eba

Authored by Justin Pettit on Jul 28, 2018

The meter code would create an entry for each new meter.  However, it
would not set the meter id in the new entry, so every meter would appear
to have a meter id of zero.  This commit properly sets the meter id when
adding the entry.

Fixes: 96fbc13d ("openvswitch: Add meter infrastructure")
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Cc: Andy Zhou <azhou@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


netlink: Do not subscribe to non-existent groups · 7acf9d42

Authored by Dmitry Safonov on Jul 27, 2018

Make ABI more strict about subscribing to group > ngroups.
Code doesn't check for that and it looks bogus.
(one can subscribe to non-existing group)
Still, it's possible to bind() to all possible groups with (-1)

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


29 Jul, 2018 6 commits

tcp_bbr: fix bw probing to raise in-flight data for very small BDPs · 383d4709

Authored by Neal Cardwell on Jul 27, 2018

For some very small BDPs (with just a few packets) there was a
quantization effect where the target number of packets in flight
during the super-unity-gain (1.25x) phase of gain cycling was
implicitly truncated to a number of packets no larger than the normal
unity-gain (1.0x) phase of gain cycling. This meant that in multi-flow
scenarios some flows could get stuck with a lower bandwidth, because
they did not push enough packets inflight to discover that there was
more bandwidth available. This was really only an issue in multi-flow
LAN scenarios, where RTTs and BDPs are low enough for this to be an
issue.

This fix ensures that gain cycling can raise inflight for small BDPs
by ensuring that in PROBE_BW mode target inflight values with a
super-unity gain are always greater than inflight values with a gain
<= 1. Importantly, this applies whether the inflight value is
calculated for use as a cwnd value, or as a target inflight value for
the end of the super-unity phase in bbr_is_next_cycle_phase() (both
need to be bigger to ensure we can probe with more packets in flight
reliably).

This is a candidate fix for stable releases.

Fixes: 0f8782ea ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
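
The quantization effect described above can be shown with a small arithmetic model. This is a deliberately simplified sketch, not the BBR code: it assumes a fixed-point gain where 256 means 1.0, a ceiling division, and a round-up to an even packet count, and it models the fix as a small constant bonus whose value (2) is an assumption made for illustration. The names GAIN_UNIT and target_inflight() are hypothetical.

```c
#define GAIN_UNIT 256u	/* fixed-point representation of a gain of 1.0 (assumed) */

/*
 * Simplified target in-flight computation, in packets.
 * 'probe_bonus' selects whether gains above 1.0 receive extra packets.
 */
unsigned int target_inflight(unsigned int bdp_pkts, unsigned int gain,
			     int probe_bonus)
{
	/* Scale the BDP by the gain, rounding up. */
	unsigned int pkts = (bdp_pkts * gain + GAIN_UNIT - 1) / GAIN_UNIT;

	/* Round up to an even number of packets. */
	pkts = (pkts + 1) & ~1u;

	/* Keep the probing target strictly above the unity-gain target. */
	if (probe_bonus && gain > GAIN_UNIT)
		pkts += 2;

	return pkts;
}
```

For a BDP of 3 packets, a gain of 1.0 (256) and a gain of 1.25 (320) both produce 4 packets in this model, so the probing phase sends nothing extra. With the bonus enabled the 1.25x target becomes 6, which is the property the commit message asks for: super-unity targets always exceed targets computed with a gain <= 1.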


net: socket: Fix potential spectre v1 gadget in sock_is_registered · e978de7a

Authored by Jeremy Cline on Jul 27, 2018

'family' can be a user-controlled value, so sanitize it after the bounds
check to avoid speculative out-of-bounds access.

Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: socket: fix potential spectre v1 gadget in socketcall · c8e8cd57

Authored by Jeremy Cline on Jul 27, 2018

'call' is a user-controlled value, so sanitize the array index after the
bounds check to avoid speculating past the bounds of the 'nargs' array.

Found with the help of Smatch:

net/socket.c:2508 __do_sys_socketcall() warn: potential spectre issue
'nargs' [r] (local cap)

Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


ipv4: remove BUG_ON() from fib_compute_spec_dst · 9fc12023

Authored by Lorenzo Bianconi on Jul 27, 2018

Remove BUG_ON() from fib_compute_spec_dst routine and check
in_dev pointer during flowi4 data structure initialization.
fib_compute_spec_dst routine can be run concurrently with device removal
where ip_ptr net_device pointer is set to NULL. This can happen
if userspace enables pkt info on UDP rx socket and the device
is removed while traffic is flowing.

Fixes: 35ebf65e ("ipv4: Create and use fib_compute_spec_dst() helper")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


bpf: use GFP_ATOMIC instead of GFP_KERNEL in bpf_parse_prog() · 71eb5255

Authored by Taehee Yoo on Jul 29, 2018

bpf_parse_prog() is called under rcu_read_lock(), so a GFP_KERNEL
allocation, which may sleep, is not allowed in bpf_parse_prog().

[51015.579396] =============================
[51015.579418] WARNING: suspicious RCU usage
[51015.579444] 4.18.0-rc6+ #208 Not tainted
[51015.579464] -----------------------------
[51015.579488] ./include/linux/rcupdate.h:303 Illegal context switch in RCU read-side critical section!
[51015.579510] other info that might help us debug this:
[51015.579532] rcu_scheduler_active = 2, debug_locks = 1
[51015.579556] 2 locks held by ip/1861:
[51015.579577]  #0: 00000000a8c12fd1 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x2e0/0x910
[51015.579711]  #1: 00000000bf815f8e (rcu_read_lock){....}, at: lwtunnel_build_state+0x96/0x390
[51015.579842] stack backtrace:
[51015.579869] CPU: 0 PID: 1861 Comm: ip Not tainted 4.18.0-rc6+ #208
[51015.579891] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[51015.579911] Call Trace:
[51015.579950]  dump_stack+0x74/0xbb
[51015.580000]  ___might_sleep+0x16b/0x3a0
[51015.580047]  __kmalloc_track_caller+0x220/0x380
[51015.580077]  kmemdup+0x1c/0x40
[51015.580077]  bpf_parse_prog+0x10e/0x230
[51015.580164]  ? kasan_kmalloc+0xa0/0xd0
[51015.580164]  ? bpf_destroy_state+0x30/0x30
[51015.580164]  ? bpf_build_state+0xe2/0x3e0
[51015.580164]  bpf_build_state+0x1bb/0x3e0
[51015.580164]  ? bpf_parse_prog+0x230/0x230
[51015.580164]  ? lock_is_held_type+0x123/0x1a0
[51015.580164]  lwtunnel_build_state+0x1aa/0x390
[51015.580164]  fib_create_info+0x1579/0x33d0
[51015.580164]  ? sched_clock_local+0xe2/0x150
[51015.580164]  ? fib_info_update_nh_saddr+0x1f0/0x1f0
[51015.580164]  ? sched_clock_local+0xe2/0x150
[51015.580164]  fib_table_insert+0x201/0x1990
[51015.580164]  ? lock_downgrade+0x610/0x610
[51015.580164]  ? fib_table_lookup+0x1920/0x1920
[51015.580164]  ? lwtunnel_valid_encap_type.part.6+0xcb/0x3a0
[51015.580164]  ? rtm_to_fib_config+0x637/0xbd0
[51015.580164]  inet_rtm_newroute+0xed/0x1b0
[51015.580164]  ? rtm_to_fib_config+0xbd0/0xbd0
[51015.580164]  rtnetlink_rcv_msg+0x331/0x910
[ ... ]

Fixes: 3a0af8fd ("bpf: BPF for lightweight tunnel infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


bpf: fix bpf_skb_load_bytes_relative pkt length check · 3eee1f75

Authored by Daniel Borkmann on Jul 28, 2018

The len > skb_headlen(skb) cannot be used as a maximum upper bound
for the packet length since it does not have any relation to the full
linear packet length when filtering is used from upper layers (e.g.
in case of reuseport BPF programs) as by then skb->data, skb->len
already got mangled through __skb_pull() and others.

Fixes: 4e1ec56c ("bpf: add skb_load_bytes_relative helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>


27 Jul, 2018 3 commits

xdp: add NULL pointer check in __xdp_return() · 36e0f12b

Authored by Taehee Yoo on Jul 26, 2018

rhashtable_lookup() can return NULL, so a NULL pointer check
should be added.

Fixes: 02b55e56 ("xdp: add MEM_TYPE_ZERO_COPY")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


RDS: RDMA: Fix the NULL-ptr deref in rds_ib_get_mr · 9e630bcb

Authored by Avinash Repaka on Jul 24, 2018

Registration of a memory region (MR) through FRMR/fastreg (unlike FMR)
needs a connection/qp. With a proxy qp, this dependency on connection
will be removed, but that needs more infrastructure patches, which is a
work in progress.

As an intermediate fix, the get_mr returns EOPNOTSUPP when connection
details are not populated. The MR registration through sendmsg() will
continue to work even with fast registration, since connection in this
case is formed upfront.

This patch fixes the following crash:
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 1 PID: 4244 Comm: syzkaller468044 Not tainted 4.16.0-rc6+ #361
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:rds_ib_get_mr+0x5c/0x230 net/rds/ib_rdma.c:544
RSP: 0018:ffff8801b059f890 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff8801b07e1300 RCX: ffffffff8562d96e
RDX: 000000000000000d RSI: 0000000000000001 RDI: 0000000000000068
RBP: ffff8801b059f8b8 R08: ffffed0036274244 R09: ffff8801b13a1200
R10: 0000000000000004 R11: ffffed0036274243 R12: ffff8801b13a1200
R13: 0000000000000001 R14: ffff8801ca09fa9c R15: 0000000000000000
FS:  00007f4d050af700(0000) GS:ffff8801db300000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4d050aee78 CR3: 00000001b0d9b006 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __rds_rdma_map+0x710/0x1050 net/rds/rdma.c:271
 rds_get_mr_for_dest+0x1d4/0x2c0 net/rds/rdma.c:357
 rds_setsockopt+0x6cc/0x980 net/rds/af_rds.c:347
 SYSC_setsockopt net/socket.c:1849 [inline]
 SyS_setsockopt+0x189/0x360 net/socket.c:1828
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x4456d9
RSP: 002b:00007f4d050aedb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00000000006dac3c RCX: 00000000004456d9
RDX: 0000000000000007 RSI: 0000000000000114 RDI: 0000000000000004
RBP: 00000000006dac38 R08: 00000000000000a0 R09: 0000000000000000
R10: 0000000020000380 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffbfb36d6f R14: 00007f4d050af9c0 R15: 0000000000000005
Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 cc 01 00 00 4c 8b bb 80 04 00 00
48
b8 00 00 00 00 00 fc ff df 49 8d 7f 68 48 89 fa 48 c1 ea 03 <80> 3c 02
00 0f
85 9c 01 00 00 4d 8b 7f 68 48 b8 00 00 00 00 00
RIP: rds_ib_get_mr+0x5c/0x230 net/rds/ib_rdma.c:544 RSP:
ffff8801b059f890
---[ end trace 7e1cea13b85473b0 ]---

Reported-by: syzbot+b51c77ef956678a65834@syzkaller.appspotmail.com
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: rollback orig value on failure of dev_qdisc_change_tx_queue_len · 7effaf06

Authored by Tariq Toukan on Jul 24, 2018

Fix dev_change_tx_queue_len so it rolls back original value
upon a failure in dev_qdisc_change_tx_queue_len.
This is already done for notifiers' failures; share the code.

In case of failure in dev_qdisc_change_tx_queue_len, some tx queues
would still be of the new length, while they should be reverted.
Currently, the revert is not done, and is marked with a TODO label
in dev_qdisc_change_tx_queue_len, and should find some nice solution
to do it.
Yet it is still better to not apply the newly requested value.

Fixes: 48bfd55e ("net_sched: plug in qdisc ops change_tx_queue_len")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
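
The shared rollback path described above can be sketched as a single error label reached from both failure points. This is a minimal userspace model, not the kernel function: struct fake_dev, step_fn, step_ok(), step_fail() and change_tx_queue_len() are hypothetical names, and the sketch restores only the device-level value, leaving aside the per-queue revert that the message says is still a TODO.

```c
/* Stand-in for the device; only the field this sketch needs. */
struct fake_dev {
	unsigned int tx_queue_len;
};

/* A step that may fail; returns 0 on success or a negative error code. */
typedef int (*step_fn)(struct fake_dev *dev);

int step_ok(struct fake_dev *dev)   { (void)dev; return 0; }
int step_fail(struct fake_dev *dev) { (void)dev; return -1; }

int change_tx_queue_len(struct fake_dev *dev, unsigned int new_len,
			step_fn notify, step_fn qdisc_change)
{
	unsigned int orig_len = dev->tx_queue_len;
	int err;

	if (new_len == orig_len)
		return 0;

	dev->tx_queue_len = new_len;

	err = notify(dev);
	if (err)
		goto err_rollback;

	err = qdisc_change(dev);
	if (err)
		goto err_rollback;

	return 0;

err_rollback:
	/* Both failure paths share the same restore code. */
	dev->tx_queue_len = orig_len;
	return err;
}
```

Calling change_tx_queue_len() with step_fail in either position returns the error and leaves tx_queue_len at its original value, so callers never observe a half-applied setting on the device itself.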


26 Jul, 2018 3 commits

xsk: fix poll/POLLIN premature returns · d24458e4

Authored by Björn Töpel on Jul 23, 2018

Polling for the ingress queues relies on reading the producer/consumer
pointers of the Rx queue.

Prior to this commit, a cached consumer pointer could be used, instead of
the actual consumer pointer and therefore report POLLIN prematurely.

This patch makes sure that the non-cached consumer pointer is used
instead.

Reported-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Qi Zhang <qi.z.zhang@intel.com>
Fixes: c497176c ("xsk: add Rx receive functions and poll support")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
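
One way to picture the premature POLLIN is a ring whose readiness test compares the producer index against a stale, locally cached copy of the consumer index. The sketch below is an assumption-laden model, not the xsk code: struct shared_ring, struct rx_queue and both helper functions are hypothetical, and a real implementation would need properly ordered loads of the shared indices (the kernel has READ_ONCE() for that).

```c
#include <stdint.h>

/* Indices shared between the producing and the consuming side. */
struct shared_ring {
	uint32_t producer;
	uint32_t consumer;
};

/* Local view of the ring, including a lazily refreshed consumer copy. */
struct rx_queue {
	struct shared_ring *ring;
	uint32_t cached_cons;
};

/* Problematic variant: a stale cache can make an empty ring look readable. */
int rx_readable_cached(const struct rx_queue *q)
{
	return q->ring->producer != q->cached_cons;
}

/* Variant using the actual consumer index from the shared ring. */
int rx_readable(const struct rx_queue *q)
{
	return q->ring->producer != q->ring->consumer;
}
```

If the consumer has already caught up with the producer but cached_cons still holds an older value, rx_readable_cached() reports data while rx_readable() correctly reports none.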


net: igmp: make function __ip_mc_inc_group() static · b87bac10

Authored by Wei Yongjun on Jul 25, 2018

Fixes the following sparse warnings:

net/ipv4/igmp.c:1391:6: warning:
 symbol '__ip_mc_inc_group' was not declared. Should it be static?

Fixes: 6e2059b5 ("ipv4/igmp: init group mode as INCLUDE when join source group")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


tcp: ack immediately when a cwr packet arrives · 9aee4000

Authored by Lawrence Brakmo on Jul 23, 2018

We observed high 99 and 99.9% latencies when doing RPCs with DCTCP. The
problem is triggered when the last packet of a request arrives CE
marked. The reply will carry the ECE mark causing TCP to shrink its cwnd
to 1 (because there are no packets in flight). When the 1st packet of
the next request arrives, the ACK was sometimes delayed even though it
is CWR marked, adding up to 40ms to the RPC latency.

This patch ensures that CWR marked data packets arriving will be acked
immediately.

Packetdrill script to reproduce the problem:

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0

0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4

0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001

0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001

0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 2:3(1) ack 2001

0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
0.200 > [ect01] . 3:3(0) ack 4001

0.210 < [ce] P. 4001:4501(500) ack 3 win 257

+0.001 read(4, ..., 4500) = 4500
+0 write(4, ..., 1) = 1
+0 > [ect01] PE. 3:4(1) ack 4501

+0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
// Previously the ACK sequence below would be 4501, causing a long RTO
+0.040~+0.045 > [ect01] . 4:4(0) ack 5501   // delayed ack

+0.311 < [ect0] . 5501:6501(1000) ack 4 win 257  // More data
+0 > [ect01] . 4:4(0) ack 6501     // now acks everything

+0.500 < F. 9501:9501(0) ack 4 win 257

Modified based on comments by Neal Cardwell <ncardwell@google.com>
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


25 Jul, 2018 1 commit

ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull · 2efd4fca

Authored by Willem de Bruijn on Jul 23, 2018

Syzbot reported a read beyond the end of the skb head when returning
IPV6_ORIGDSTADDR:

  BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242
  CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
  Google 01/01/2011
  Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x185/0x1d0 lib/dump_stack.c:113
    kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125
    kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219
    kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261
    copy_to_user include/linux/uaccess.h:184 [inline]
    put_cmsg+0x5ef/0x860 net/core/scm.c:242
    ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719
    ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733
    rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521
    [..]

This logic and its ipv4 counterpart read the destination port from
the packet at skb_transport_offset(skb) + 4.

With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a
packet that stores headers exactly up to skb_transport_offset(skb) in
the head and the remainder in a frag.

Call pskb_may_pull before accessing the pointer to ensure that it lies
in skb head.

Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com
Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


24 Jul, 2018 10 commits

cfg80211: never ignore user regulatory hint · e31f6456

Authored by Amar Singhal on Jul 20, 2018

Currently user regulatory hint is ignored if all wiphys
in the system are self managed. But the hint is not ignored
if there is no wiphy in the system. This affects the global
regulatory setting. Global regulatory setting needs to be
maintained so that it can be applied to a new wiphy entering
the system. Therefore, do not ignore user regulatory setting
even if all wiphys in the system are self managed.

Signed-off-by: Amar Singhal <asinghal@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>


sock: fix sg page frag coalescing in sk_alloc_sg · 144fe2bf

Authored by Daniel Borkmann on Jul 23, 2018

Current sg coalescing logic in sk_alloc_sg() (latter is used by tls and
sockmap) is not quite correct in that we do fetch the previous sg entry,
however the subsequent check whether the refilled page frag from the
socket is still the same as from the last entry with prior offset and
length matching the start of the current buffer is comparing always the
first sg list entry instead of the prior one.

Fixes: 3c4d7559 ("tls: kernel TLS support")
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
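
The coalescing rule described above (merge a new fragment into the previous entry only when it continues that entry on the same page) can be sketched with a small array-backed list. This is an illustrative model under stated assumptions, not sk_alloc_sg() itself: struct frag_entry and sg_add_frag() are hypothetical names, and the list is a plain array rather than a kernel scatterlist.

```c
/* One entry of a simplified scatter list. */
struct frag_entry {
	const void *page;
	unsigned int offset;
	unsigned int length;
};

/*
 * Append a fragment, merging it into the previous entry when it starts
 * exactly where that entry ends on the same page.
 * Returns 0 on success, -1 when the list is full.
 */
int sg_add_frag(struct frag_entry *sg, int *count, int max,
		const void *page, unsigned int offset, unsigned int length)
{
	if (*count > 0) {
		/* Compare against the prior entry, not against sg[0]. */
		struct frag_entry *prev = &sg[*count - 1];

		if (prev->page == page &&
		    prev->offset + prev->length == offset) {
			prev->length += length;
			return 0;
		}
	}

	if (*count >= max)
		return -1;

	sg[*count].page = page;
	sg[*count].offset = offset;
	sg[*count].length = length;
	(*count)++;
	return 0;
}
```

Adding a fragment on page A, then two adjacent fragments on page B, leaves two entries with the second one grown to cover both B fragments. A comparison against the first entry would have missed that merge.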


netfilter: nf_tables: move dumper state allocation into ->start · 90fd131a

Authored by Florian Westphal on Jul 23, 2018

Shaochun Chen points out we leak dumper filter state allocations
stored in dump_control->data in case there is an error before netlink sets
cb_running (after which ->done will be called at some point).

In order to fix this, add .start functions and do the allocations
there.

->done is going to clean up, and in case error occurs before
->start invocation no cleanups need to be done anymore.

Reported-by: shaochun chen <cscnull@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


tcp: add tcp_ooo_try_coalesce() helper · 58152ecb