提交 · 84602761ca4495dd409be936dfa93ed20c946684 · openanolis / cloud-kernel

30 12月, 2013 1 次提交

tipc: fix deadlock during socket release · 84602761

由 Ying Xue 提交于 12月 27, 2013

A deadlock might occur if name table is withdrawn in socket release
routine, and while packets are still being received from bearer.

       CPU0                       CPU1
T0:   recv_msg()               release()
T1:   tipc_recv_msg()          tipc_withdraw()
T2:   [grab node lock]         [grab port lock]
T3:   tipc_link_wakeup_ports() tipc_nametbl_withdraw()
T4:   [grab port lock]*        named_cluster_distribute()
T5:   wakeupdispatch()         tipc_link_send()
T6:                            [grab node lock]*

The opposite order of holding port lock and node lock on above two
different paths may result in a deadlock. If socket lock instead of
port lock is used to protect port instance in tipc_withdraw(), the
reverse order of holding port lock and node lock will be eliminated,
as a result, the deadlock is killed as well.
Reported-by: NLars Everbrand <lars.everbrand@ericsson.com>
Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

84602761

29 12月, 2013 1 次提交

netfilter: nf_tables: fix wrong datatype in nft_validate_data_load() · 2ee0d3c8

由 Pablo Neira Ayuso 提交于 12月 28, 2013

This patch fixes dictionary mappings, eg.

 add rule ip filter input meta dnat set tcp dport map { 22 => 1.1.1.1, 23 => 2.2.2.2 }

The kernel was returning -EINVAL in nft_validate_data_load() since
the type of the set element data that is passed was the real userspace
datatype instead of NFT_DATA_VALUE.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

2ee0d3c8

28 12月, 2013 12 次提交

batman-adv: fix vlan header access · 2b1e2cb3

由 Antonio Quartulli 提交于 12月 23, 2013

When batadv_get_vid() is invoked in interface_rx() the
batman-adv header has already been removed, therefore
the header_len argument has to be 0.

Introduced by c018ad3d
("batman-adv: add the VLAN ID attribute to the TT entry")
Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>

2b1e2cb3

batman-adv: clean nf state when removing protocol header · 55883fd1

由 Antonio Quartulli 提交于 12月 23, 2013

If an interface enslaved into batman-adv is a bridge (or a
virtual interface built on top of a bridge) the nf_bridge
member of the skbs reaching the soft-interface is filled
with the state about "netfilter bridge" operations.

Then, if one of such skbs is locally delivered, the nf_bridge
member should be cleaned up to avoid that the old state
could mess up with other "netfilter bridge" operations when
entering a second bridge.
This is needed because batman-adv is an encapsulation
protocol.

However at the moment skb->nf_bridge is not released at all
leading to bogus "netfilter bridge" behaviours.

Fix this by cleaning the netfilter state of the skb before
it gets delivered to the upper layer in interface_rx().
Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>

55883fd1

batman-adv: fix alignment for batadv_tvlv_tt_change · ca663046

由 Antonio Quartulli 提交于 12月 15, 2013

Make struct batadv_tvlv_tt_change a multiple 4 bytes long
to avoid padding on any architecture.
Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>

ca663046

batman-adv: fix size of batadv_bla_claim_dst · 2f7a3182

由 Simon Wunderlich 提交于 12月 02, 2013

Since this is a mac address and always 48 bit, and we can assume that
it is always aligned to 2-byte boundaries, add a pack(2) pragma.
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com>

2f7a3182

batman-adv: fix size of batadv_icmp_header · 27a417e6

由 Antonio Quartulli 提交于 12月 05, 2013

struct batadv_icmp_header currently has a size of 17, which
will be padded to 20 on some architectures. Fix this by
unrolling the header into the parent structures.

Moreover keep the ICMP parsing functions as generic as they
are now by using a stub icmp_header struct during packet
parsing.
Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>

27a417e6

batman-adv: fix header alignment by unrolling batadv_header · a40d9b07

由 Simon Wunderlich 提交于 12月 02, 2013

The size of the batadv_header of 3 is problematic on some architectures
which automatically pad all structures to a 32 bit boundary. To not lose
performance by packing this struct, better embed it into the various
host structures.
Reported-by: NRussell King <linux@arm.linux.org.uk>
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com>

a40d9b07

batman-adv: fix alignment for batadv_coded_packet · 46b76e0b

由 Simon Wunderlich 提交于 12月 02, 2013

The compiler may decide to pad the structure, and then it does not
have the expected size of 46 byte. Fix this by moving it in the
pragma pack(2) part of the code.
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com>

46b76e0b

netfilter: nf_tables: fix oops when updating table with user chains · d2012975

由 Pablo Neira Ayuso 提交于 12月 27, 2013

This patch fixes a crash while trying to deactivate a table that
contains user chains. You can reproduce it via:

% nft add table table1
% nft add chain table1 chain1
% nft-table-upd ip table1 dormant

[  253.021026] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[  253.021114] IP: [<ffffffff8134cebd>] nf_register_hook+0x35/0x6f
[  253.021167] PGD 30fa5067 PUD 30fa2067 PMD 0
[  253.021208] Oops: 0000 [#1] SMP
[...]
[  253.023305] Call Trace:
[  253.023331]  [<ffffffffa0885020>] nf_tables_newtable+0x11c/0x258 [nf_tables]
[  253.023385]  [<ffffffffa0878592>] nfnetlink_rcv_msg+0x1f4/0x226 [nfnetlink]
[  253.023438]  [<ffffffffa0878418>] ? nfnetlink_rcv_msg+0x7a/0x226 [nfnetlink]
[  253.023491]  [<ffffffffa087839e>] ? nfnetlink_bind+0x45/0x45 [nfnetlink]
[  253.023542]  [<ffffffff8134b47e>] netlink_rcv_skb+0x3c/0x88
[  253.023586]  [<ffffffffa0878973>] nfnetlink_rcv+0x3af/0x3e4 [nfnetlink]
[  253.023638]  [<ffffffff813fb0d4>] ? _raw_read_unlock+0x22/0x34
[  253.023683]  [<ffffffff8134af17>] netlink_unicast+0xe2/0x161
[  253.023727]  [<ffffffff8134b29a>] netlink_sendmsg+0x304/0x332
[  253.023773]  [<ffffffff8130d250>] __sock_sendmsg_nosec+0x25/0x27
[  253.023820]  [<ffffffff8130fb93>] sock_sendmsg+0x5a/0x7b
[  253.023861]  [<ffffffff8130d5d5>] ? copy_from_user+0x2a/0x2c
[  253.023905]  [<ffffffff8131066f>] ? move_addr_to_kernel+0x35/0x60
[  253.023952]  [<ffffffff813107b3>] SYSC_sendto+0x119/0x15c
[  253.023995]  [<ffffffff81401107>] ? sysret_check+0x1b/0x56
[  253.024039]  [<ffffffff8108dc30>] ? trace_hardirqs_on_caller+0x140/0x1db
[  253.024090]  [<ffffffff8120164e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  253.024141]  [<ffffffff81310caf>] SyS_sendto+0x9/0xb
[  253.026219]  [<ffffffff814010e2>] system_call_fastpath+0x16/0x1b
Reported-by: NAlex Wei <alex.kern.mentor@gmail.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

d2012975

netfilter: nf_tables: fix dumping with large number of sets · e38195bf

由 Pablo Neira Ayuso 提交于 12月 24, 2013

If not table name is specified, the dumping of the existing sets
may be incomplete with a sufficiently large number of sets and
tables. This patch fixes missing reset of the cursors after
finding the location of the last object that has been included
in the previous multi-part message.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

e38195bf

ipv6: release dst properly in ipip6_tunnel_xmit · 6a9eadcc

由 Li RongQing 提交于 12月 20, 2013

if a dst is not attached to anywhere, it should be released before
exit ipip6_tunnel_xmit, otherwise cause dst memory leakage.

Fixes: 61c1db7f ("ipv6: sit: add GSO/TSO support")
Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6a9eadcc

net_sched: act: Dont increment refcnt on replace · 1a29321e

由 Jamal Hadi Salim 提交于 12月 23, 2013

 This is a bug fix. The existing code tries to kill many
 birds with one stone: Handling binding of actions to
 filters, new actions and replacing of action
 attributes. A simple test case to illustrate:

XXXX
 moja@fe1:~$ sudo tc actions add action drop index 12
 moja@fe1:~$ actions get action gact index 12
 action order 1: gact action drop
  random type none pass val 0
  index 12 ref 1 bind 0
 moja@fe1:~$ sudo tc actions replace action ok index 12
 moja@fe1:~$ actions get action gact index 12
 action order 1: gact action drop
  random type none pass val 0
  index 12 ref 2 bind 0
XXXX

The above shows the refcounf being wrongly incremented on replace.
There are more complex scenarios with binding of actions to filters
that i am leaving out that didnt work as well...
Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1a29321e

rds: prevent dereference of a NULL device · c2349758

由 Sasha Levin 提交于 12月 18, 2013

Binding might result in a NULL device, which is dereferenced
causing this BUG:

[ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 000000000000097
4
[ 1317.261847] IP: [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0
[ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 1317.264179] Dumping ftrace buffer:
[ 1317.264774]    (ftrace buffer empty)
[ 1317.265220] Modules linked in:
[ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G        W    3.13.0-rc4-
next-20131218-sasha-00013-g2cebb9b-dirty #4159
[ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000
[ 1317.268399] RIP: 0010:[<ffffffff84225f52>]  [<ffffffff84225f52>] rds_ib_laddr_check+
0x82/0x110
[ 1317.269670] RSP: 0000:ffff8803cd31bdf8  EFLAGS: 00010246
[ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000
[ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286
[ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000
[ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031
[ 1317.270230] FS:  00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:000000000000
0000
[ 1317.270230] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0
[ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
[ 1317.270230] Stack:
[ 1317.270230]  0000000054086700 5408670000a25de0 5408670000000002 0000000000000000
[ 1317.270230]  ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160
[ 1317.270230]  ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280
[ 1317.270230] Call Trace:
[ 1317.270230]  [<ffffffff84223542>] ? rds_trans_get_preferred+0x42/0xa0
[ 1317.270230]  [<ffffffff84223556>] rds_trans_get_preferred+0x56/0xa0
[ 1317.270230]  [<ffffffff8421c9c3>] rds_bind+0x73/0xf0
[ 1317.270230]  [<ffffffff83e4ce62>] SYSC_bind+0x92/0xf0
[ 1317.270230]  [<ffffffff812493f8>] ? context_tracking_user_exit+0xb8/0x1d0
[ 1317.270230]  [<ffffffff8119313d>] ? trace_hardirqs_on+0xd/0x10
[ 1317.270230]  [<ffffffff8107a852>] ? syscall_trace_enter+0x32/0x290
[ 1317.270230]  [<ffffffff83e4cece>] SyS_bind+0xe/0x10
[ 1317.270230]  [<ffffffff843a6ad0>] tracesys+0xdd/0xe2
[ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00
89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 7
4 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02
[ 1317.270230] RIP  [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.270230]  RSP <ffff8803cd31bdf8>
[ 1317.270230] CR2: 0000000000000974
Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c2349758

27 12月, 2013 2 次提交

ipvs: correct usage/allocation of seqadj ext in ipvs · b25adce1

由 Jesper Dangaard Brouer 提交于 12月 16, 2013

The IPVS FTP helper ip_vs_ftp could trigger an OOPS in nf_ct_seqadj_set,
after commit 41d73ec0 (netfilter: nf_conntrack: make sequence number
adjustments usuable without NAT).

This is because, the seqadj ext is now allocated dynamically, and the
IPVS code didn't handle this situation.  Fix this in the IPVS nfct
code by invoking the alloc function nfct_seqadj_ext_add().

Fixes: 41d73ec0 (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT)
Suggested-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

b25adce1

netfilter: WARN about wrong usage of sequence number adjustments · db12cf27

由 Jesper Dangaard Brouer 提交于 12月 16, 2013

Since commit 41d73ec0 (netfilter: nf_conntrack: make sequence
number adjustments usuable without NAT), the sequence number extension
is dynamically allocated.

Instead of dying, give a WARN splash, in case of wrong usage of the
seqadj code, e.g. when forgetting to allocate via nfct_seqadj_ext_add().

Wrong usage have been seen in the IPVS code path.
Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

db12cf27

23 12月, 2013 1 次提交

ipv4: consistent reporting of pmtu data in case of corking · 61e7f09d

由 Hannes Frederic Sowa 提交于 12月 19, 2013

We report different pmtu values back on the first write and on further
writes on an corked socket.

Also don't include the dst.header_len (respectively exthdrlen) as this
should already be dealt with by the interface mtu of the outgoing
(virtual) interface and policy of that interface should dictate if
fragmentation should happen.

Instead reduce the pmtu data by IP options as we do for IPv6. Make the
same changes for ip_append_data, where we did not care about options or
dst.header_len at all.
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

61e7f09d

20 12月, 2013 5 次提交

netfilter: nf_ct_timestamp: Fix BUG_ON after netns deletion · 443d20fd

由 Helmut Schaa 提交于 12月 20, 2013

When having nf_conntrack_timestamp enabled deleting a netns
can lead to the following BUG being triggered:

[63836.660000] Kernel bug detected[#1]:
[63836.660000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.18 #14
[63836.660000] task: 802d9420 ti: 802d2000 task.ti: 802d2000
[63836.660000] $ 0   : 00000000 00000000 00000000 00000000
[63836.660000] $ 4   : 00000001 00000004 00000020 00000020
[63836.660000] $ 8   : 00000000 80064910 00000000 00000000
[63836.660000] $12   : 0bff0002 00000001 00000000 0a0a0abe
[63836.660000] $16   : 802e70a0 85f29d80 00000000 00000004
[63836.660000] $20   : 85fb62a0 00000002 802d3bc0 85fb62a0
[63836.660000] $24   : 00000000 87138110
[63836.660000] $28   : 802d2000 802d3b40 00000014 871327cc
[63836.660000] Hi    : 000005ff
[63836.660000] Lo    : f2edd000
[63836.660000] epc   : 87138794 __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack]
[63836.660000]     Not tainted
[63836.660000] ra    : 871327cc nf_conntrack_in+0x31c/0x7b8 [nf_conntrack]
[63836.660000] Status: 1100d403 KERNEL EXL IE
[63836.660000] Cause : 00800034
[63836.660000] PrId  : 0001974c (MIPS 74Kc)
[63836.660000] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_quota xt_policy xt_pkttype xt_owner xt_nat xt_multiport xt_mark xh
[63836.660000] Process swapper (pid: 0, threadinfo=802d2000, task=802d9420, tls=00000000)
[63836.660000] Stack : 802e70a0 871323d4 00000005 87080234 802e70a0 86d2a840 00000000 00000000
[63836.660000] Call Trace:
[63836.660000] [<87138794>] __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack]
[63836.660000] [<871327cc>] nf_conntrack_in+0x31c/0x7b8 [nf_conntrack]
[63836.660000] [<801ff63c>] nf_iterate+0x90/0xec
[63836.660000] [<801ff730>] nf_hook_slow+0x98/0x164
[63836.660000] [<80205968>] ip_rcv+0x3e8/0x40c
[63836.660000] [<801d9754>] __netif_receive_skb_core+0x624/0x6a4
[63836.660000] [<801da124>] process_backlog+0xa4/0x16c
[63836.660000] [<801d9bb4>] net_rx_action+0x10c/0x1e0
[63836.660000] [<8007c5a4>] __do_softirq+0xd0/0x1bc
[63836.660000] [<8007c730>] do_softirq+0x48/0x68
[63836.660000] [<8007c964>] irq_exit+0x54/0x70
[63836.660000] [<80060830>] ret_from_irq+0x0/0x4
[63836.660000] [<8006a9f8>] r4k_wait_irqoff+0x18/0x1c
[63836.660000] [<8009cfb8>] cpu_startup_entry+0xa4/0x104
[63836.660000] [<802eb918>] start_kernel+0x394/0x3ac
[63836.660000]
[63836.660000]
Code: 00821021  8c420000  2c440001 <00040336> 90440011  92350010  90560010  2485ffff  02a5a821
[63837.040000] ---[ end trace ebf660c3ce3b55e7 ]---
[63837.050000] Kernel panic - not syncing: Fatal exception in interrupt
[63837.050000] Rebooting in 3 seconds..

Fix this by not unregistering the conntrack extension in the per-netns
cleanup code.

This bug was introduced in (73f4001a netfilter: nf_ct_tstamp: move
initialization out of pernet_operations).
Signed-off-by: NHelmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

443d20fd

netfilter: nft_exthdr: call ipv6_find_hdr() with explicitly initialized offset · 540436c8

由 Daniel Borkmann 提交于 12月 20, 2013

In nft's nft_exthdr_eval() routine we process IPv6 extension header
through invoking ipv6_find_hdr(), but we call it with an uninitialized
offset variable that contains some stack value. In ipv6_find_hdr()
we then test if the value of offset != 0 and call skb_header_pointer()
on that offset in order to map struct ipv6hdr into it. Fix it up by
initializing offset to 0 as it was probably intended to be.

Fixes: 96518518 ("netfilter: add nftables")
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

540436c8

dccp: catch failed request_module call in dccp_probe init · 965cdea8

由 Wang Weidong 提交于 12月 17, 2013

Check the return value of request_module during dccp_probe initialisation,
bail out if that call fails.
Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: NWang Weidong <wangweidong1@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

965cdea8

ipv6: always set the new created dst's from in ip6_rt_copy · 24f5b855

由 Li RongQing 提交于 12月 19, 2013

ip6_rt_copy only sets dst.from if ort has flag RTF_ADDRCONF and RTF_DEFAULT.
but the prefix routes which did get installed by hand locally can have an
expiration, and no any flag combination which can ensure a potential from
does never expire, so we should always set the new created dst's from.

This also fixes the new created dst is always expired since the ort, which
is created by RA, maybe has RTF_EXPIRES and RTF_ADDRCONF, but no RTF_DEFAULT.
Suggested-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
CC: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: NLi RongQing <roy.qing.li@gmail.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

24f5b855

net: inet_diag: zero out uninitialized idiag_{src,dst} fields · b1aac815

由 Daniel Borkmann 提交于 12月 17, 2013

Jakub reported while working with nlmon netlink sniffer that parts of
the inet_diag_sockid are not initialized when r->idiag_family != AF_INET6.
That is, fields of r->id.idiag_src[1 ... 3], r->id.idiag_dst[1 ... 3].

In fact, it seems that we can leak 6 * sizeof(u32) byte of kernel [slab]
memory through this. At least, in udp_dump_one(), we allocate a skb in ...

  rep = nlmsg_new(sizeof(struct inet_diag_msg) + ..., GFP_KERNEL);

... and then pass that to inet_sk_diag_fill() that puts the whole struct
inet_diag_msg into the skb, where we only fill out r->id.idiag_src[0],
r->id.idiag_dst[0] and leave the rest untouched:

  r->id.idiag_src[0] = inet->inet_rcv_saddr;
  r->id.idiag_dst[0] = inet->inet_daddr;

struct inet_diag_msg embeds struct inet_diag_sockid that is correctly /
fully filled out in IPv6 case, but for IPv4 not.

So just zero them out by using plain memset (for this little amount of
bytes it's probably not worth the extra check for idiag_family == AF_INET).

Similarly, fix also other places where we fill that out.
Reported-by: NJakub Zawadzki <darkjames-ws@darkjames.pl>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b1aac815

19 12月, 2013 3 次提交

ipv6: sit: update mtu check to take care of gso packets · 58a47824

由 Eric Dumazet 提交于 12月 16, 2013

While testing my changes for TSO support in SIT devices,
I was using sit0 tunnel which appears to include nopmtudisc flag.

But using :

ip tun add sittun mode sit remote $REMOTE_IPV4 local $LOCAL_IPV4 \
   dev $IFACE

We get a tunnel which rejects too long packets because of the mtu check
which is not yet GSO aware.

erd:~# ip tunnel
sittun: ipv6/ip  remote 10.246.17.84  local 10.246.17.83  ttl inherit  6rd-prefix 2002::/16
sit0: ipv6/ip  remote any  local any  ttl 64  nopmtudisc 6rd-prefix 2002::/16

This patch is based on an excellent report from
Michal Shmidt.

In the future, we probably want to extend the MTU check to do the
right thing for GSO packets...

Fixes: ("61c1db7f ipv6: sit: add GSO/TSO support")
Reported-by: NMichal Schmidt <mschmidt@redhat.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Tested-by: NMichal Schmidt <mschmidt@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

58a47824

ipv6: pmtudisc setting not respected with UFO/CORK · 4df98e76

由 Hannes Frederic Sowa 提交于 12月 16, 2013

Sockets marked with IPV6_PMTUDISC_PROBE (or later IPV6_PMTUDISC_INTERFACE)
don't respect this setting when the outgoing interface supports UFO.

We had the same problem in IPv4, which was fixed in commit
daba287b ("ipv4: fix DO and PROBE pmtu
mode regarding local fragmentation with UFO/CORK").

Also IPV6_DONTFRAG mode did not care about already corked data, thus
it may generate a fragmented frame even if this socket option was
specified. It also did not care about the length of the ipv6 header and
possible options.

In the error path allow the user to receive the pmtu notifications via
both, rxpmtu method or error queue. The user may opted in for both,
so deliver the notification to both error handlers (the handlers check
if the error needs to be enqueued).

Also report back consistent pmtu values when sending on an already
cork-appended socket.
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4df98e76

ip_gre: fix msg_name parsing for recvfrom/recvmsg · 0e3da5bb

由 Timo Teräs 提交于 12月 16, 2013

ipgre_header_parse() needs to parse the tunnel's ip header and it
uses mac_header to locate the iphdr. This got broken when gre tunneling
was refactored as mac_header is no longer updated to point to iphdr.
Introduce skb_pop_mac_header() helper to do the mac_header assignment
and use it in ipgre_rcv() to fix msg_name parsing.

Bug introduced in commit c5441932 (GRE: Refactor GRE tunneling code.)

Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: NTimo Teräs <timo.teras@iki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0e3da5bb

18 12月, 2013 5 次提交

net: allow netdev_all_upper_get_next_dev_rcu with rtnl lock held · 85328240

由 John Fastabend 提交于 11月 26, 2013

It is useful to be able to walk all upper devices when bringing
a device online where the RTNL lock is held. In this case it
is safe to walk the all_adj_list because the RTNL lock is used
to protect the write side as well.

This patch adds a check to see if the rtnl lock is held before
throwing a warning in netdev_all_upper_get_next_dev_rcu().

Also because we now have a call site for lockdep_rtnl_is_held()
outside COFIG_LOCK_PROVING an inline definition returning 1 is
needed. Similar to the rcu_read_lock_is_held().

Fixes: 2a47fa45 ("ixgbe: enable l2 forwarding acceleration for macvlans")
CC: Veaceslav Falico <vfalico@redhat.com>
Reported-by: NYuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

85328240

netfilter: nfnetlink_log: unset nf_loggers for netns when unloading module · 45c2aff6

由 Gao feng 提交于 12月 16, 2013

Steven Rostedt and Arnaldo Carvalho de Melo reported a panic
when access the files /proc/sys/net/netfilter/nf_log/*.

This problem will occur when we do:

 echo nfnetlink_log > /proc/sys/net/netfilter/nf_log/any_file
 rmmod nfnetlink_log

and then access the files.

Since the nf_loggers of netns hasn't been unset, it will point
to the memory that has been freed.

This bug is introduced by commit 9368a53c ("netfilter: nfnetlink_log:
add net namespace support for nfnetlink_log").

[17261.822047] BUG: unable to handle kernel paging request at ffffffffa0d49090
[17261.822056] IP: [<ffffffff8157aba0>] nf_log_proc_dostring+0xf0/0x1d0
[...]
[17261.822226] Call Trace:
[17261.822235]  [<ffffffff81297b98>] ? security_capable+0x18/0x20
[17261.822240]  [<ffffffff8106fa09>] ? ns_capable+0x29/0x50
[17261.822247]  [<ffffffff8163d25f>] ? net_ctl_permissions+0x1f/0x90
[17261.822254]  [<ffffffff81216613>] proc_sys_call_handler+0xb3/0xc0
[17261.822258]  [<ffffffff81216651>] proc_sys_read+0x11/0x20
[17261.822265]  [<ffffffff811a80de>] vfs_read+0x9e/0x170
[17261.822270]  [<ffffffff811a8c09>] SyS_read+0x49/0xa0
[17261.822276]  [<ffffffff810e6496>] ? __audit_syscall_exit+0x1f6/0x2a0
[17261.822283]  [<ffffffff81656e99>] system_call_fastpath+0x16/0x1b
[17261.822285] Code: cc 81 4d 63 e4 4c 89 45 88 48 89 4d 90 e8 19 03 0d 00 4b 8b 84 e5 28 08 00 00 48 8b 4d 90 4c 8b 45 88 48 85 c0 0f 84 a8 00 00 00 <48> 8b 40 10 48 89 43 08 48 89 df 4c 89 f2 31 f6 e8 4b 35 af ff
[17261.822329] RIP  [<ffffffff8157aba0>] nf_log_proc_dostring+0xf0/0x1d0
[17261.822334]  RSP <ffff880274d3fe28>
[17261.822336] CR2: ffffffffa0d49090
[17261.822340] ---[ end trace a14ce54c0897a90d ]---
Reported-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
Reported-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

45c2aff6

neigh: Netlink notification for administrative NUD state change · 53385d2d

由 Bob Gilligan 提交于 12月 15, 2013

The neighbour code sends up an RTM_NEWNEIGH netlink notification if
the NUD state of a neighbour cache entry is changed by a timer (e.g.
from REACHABLE to STALE), even if the lladdr of the entry has not
changed.

But an administrative change to the the NUD state of a neighbour cache
entry that does not change the lladdr (e.g. via "ip -4 neigh change
...  nud ...") does not trigger a netlink notification.  This means
that netlink listeners will not hear about administrative NUD state
changes such as from a resolved state to PERMANENT.

This patch changes the neighbor code to generate an RTM_NEWNEIGH
message when the NUD state of an entry is changed administratively.
Signed-off-by: NBob Gilligan <gilligan@aristanetworks.com>
Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

53385d2d

net: unix: allow bind to fail on mutex lock · 37ab4fa7

由 Sasha Levin 提交于 12月 13, 2013

This is similar to the set_peek_off patch where calling bind while the
socket is stuck in unix_dgram_recvmsg() will block and cause a hung task
spew after a while.

This is also the last place that did a straightforward mutex_lock(), so
there shouldn't be any more of these patches.
Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37ab4fa7

udp: ipv4: do not use sk_dst_lock from softirq context · e47eb5df

由 Eric Dumazet 提交于 12月 15, 2013

Using sk_dst_lock from softirq context is not supported right now.

Instead of adding BH protection everywhere,
udp_sk_rx_dst_set() can instead use xchg(), as suggested
by David.
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Fixes: 97502231 ("udp: ipv4: must add synchronization in udp_sk_rx_dst_set()")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e47eb5df

17 12月, 2013 2 次提交

Bluetooth: Fix HCI User Channel permission check in hci_sock_sendmsg · 1bc5ad16

由 Marcel Holtmann 提交于 12月 17, 2013

The HCI User Channel is an admin operation which enforces CAP_NET_ADMIN
when binding the socket. Problem now is that it then requires also
CAP_NET_RAW when calling into hci_sock_sendmsg. This is not intended
and just an oversight since general HCI sockets (which do not require
special permission to bind) and HCI User Channel share the same code
path here.

Remove the extra CAP_NET_RAW check for HCI User Channel write operation
since the permission check has already been enforced when binding the
socket. This also makes it possible to open HCI User Channel from a
privileged process and then hand the file descriptor to an unprivilged
process.
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
Tested-by: NSamuel Ortiz <sameo@linux.intel.com>
Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>

1bc5ad16

sctp: loading sctp when load sctp_probe · 9bd7d20c

由 wangweidong 提交于 12月 13, 2013

when I modprobe sctp_probe, it failed with "FATAL: ". I found that
sctp should load before sctp_probe register jprobe. So I add a
sctp_setup_jprobe for loading 'sctp' when first failed to register
jprobe, just do this similar to dccp_probe.

v2: add MODULE_SOFTDEP and check of request_module, as suggested by Neil
Signed-off-by: NWang Weidong <wangweidong1@huawei.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9bd7d20c

16 12月, 2013 1 次提交

radiotap: fix bitmap-end-finding buffer overrun · bd02cd25

由 Johannes Berg 提交于 12月 16, 2013

Evan Huus found (by fuzzing in wireshark) that the radiotap
iterator code can access beyond the length of the buffer if
the first bitmap claims an extension but then there's no
data at all. Fix this.

Cc: stable@vger.kernel.org
Reported-by: NEvan Huus <eapache@gmail.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

bd02cd25

12 12月, 2013 6 次提交

netfilter: nft_reject: fix endianness in dump function · a3adadf3

由 Eric Leblond 提交于 12月 12, 2013

The dump function in nft_reject_ipv4 was not converting a u32
field to network order before sending it to userspace, this
needs to happen for consistency with other nf_tables and
nfnetlink subsystems.
Signed-off-by: NEric Leblond <eric@regit.org>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

a3adadf3

sch_tbf: use do_div() for 64-bit divide · d55d282e

由 Yang Yingliang 提交于 12月 12, 2013

It's doing a 64-bit divide which is not supported
on 32-bit architectures in psched_ns_t2l(). The
correct way to do this is to use do_div().

It's introduced by commit cc106e44
("net: sched: tbf: fix the calculation of max_size")
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d55d282e

udp: ipv4: must add synchronization in udp_sk_rx_dst_set() · 97502231

由 Eric Dumazet 提交于 12月 11, 2013

Unlike TCP, UDP input path does not hold the socket lock.

Before messing with sk->sk_rx_dst, we must use a spinlock, otherwise
multiple cpus could leak a refcount.

This patch also takes care of renewing a stale dst entry.
(When the sk->sk_rx_dst would not be used by IP early demux)

Fixes: 421b3885 ("udp: ipv4: Add udp early demux")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97502231

udp: ipv4: fix potential use after free in udp_v4_early_demux() · 610438b7

由 Eric Dumazet 提交于 12月 11, 2013

pskb_may_pull() can reallocate skb->head, we need to move the
initialization of iph and uh pointers after its call.

Fixes: 421b3885 ("udp: ipv4: Add udp early demux")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

610438b7

net: sched: htb: fix the calculation of quantum · 1598f7cb

由 Yang Yingliang 提交于 12月 10, 2013

Now, 32bit rates may be not the true rate.
So use rate_bytes_ps which is from
max(rate32, rate64) to calcualte quantum.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1598f7cb

net: sched: tbf: fix the calculation of max_size · cc106e44

由 Yang Yingliang 提交于 12月 10, 2013

Current max_size is caluated from rate table. Now, the rate table
has been replaced and it's wrong to caculate max_size based on this
rate table. It can lead wrong calculation of max_size.

The burst in kernel may be lower than user asked, because burst may gets
some loss when transform it to buffer(E.g. "burst 40kb rate 30mbit/s")
and it seems we cannot avoid this loss. Burst's value(max_size) based on
rate table may be equal user asked. If a packet's length is max_size, this
packet will be stalled in tbf_dequeue() because its length is above the
burst in kernel so that it cannot get enough tokens. The max_size guards
against enqueuing packet sizes above q->buffer "time" in tbf_enqueue().

To make consistent with the calculation of tokens, this patch add a helper
psched_ns_t2l() to calculate burst(max_size) directly to fix this problem.

After this fix, we can support to using 64bit rates to calculate burst as well.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cc106e44

11 12月, 2013 1 次提交

netfilter: SYNPROXY target: restrict to INPUT/FORWARD · f01b3926

由 Patrick McHardy 提交于 12月 08, 2013

Fix a crash in synproxy_send_tcp() when using the SYNPROXY target in the
PREROUTING chain caused by missing routing information.
Reported-by: NNicki P. <xastx@gmx.de>
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

f01b3926

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功