Commits · ba9d114ec5578e6e99a4dfa37ff8ae688040fd64 · openanolis / cloud-kernel

14 Nov, 2014 3 commits

libceph: clear r_req_lru_item in __unregister_linger_request() · ba9d114e

Authored by Ilya Dryomov on Nov 05, 2014

kick_requests() can put linger requests on the notarget list.  This
means we need to clear the much-overloaded req->r_req_lru_item in
__unregister_linger_request() as well, or we get an assertion failure
in ceph_osdc_release_request() - !list_empty(&req->r_req_lru_item).

AFAICT the assumption was that registered linger requests cannot be on
any of req->r_req_lru_item lists, but that's clearly not the case.

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>

ba9d114e

libceph: unlink from o_linger_requests when clearing r_osd · a390de02

Authored by Ilya Dryomov on Nov 04, 2014

Requests have to be unlinked from both osd->o_requests (normal
requests) and osd->o_linger_requests (linger requests) lists when
clearing req->r_osd.  Otherwise __unregister_linger_request() gets
confused and we trip over a !list_empty(&osd->o_linger_requests)
assert in __remove_osd().

MON=1 OSD=1:

    # cat remove-osd.sh
    #!/bin/bash
    rbd create --size 1 test
    DEV=$(rbd map test)
    ceph osd out 0
    sleep 3
    rbd map dne/dne # obtain a new osdmap as a side effect
    rbd unmap $DEV & # will block
    sleep 3
    ceph osd in 0

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>

a390de02

libceph: do not crash on large auth tickets · aaef3170

Authored by Ilya Dryomov on Oct 23, 2014

Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth
tickets will have their buffers vmalloc'ed, which leads to the
following crash in crypto:

[   28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
[   28.686032] IP: [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
[   28.686032] PGD 0
[   28.688088] Oops: 0000 [#1] PREEMPT SMP
[   28.688088] Modules linked in:
[   28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
[   28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[   28.688088] Workqueue: ceph-msgr con_work
[   28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
[   28.688088] RIP: 0010:[<ffffffff81392b42>]  [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
[   28.688088] RSP: 0018:ffff8800d903f688  EFLAGS: 00010286
[   28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
[   28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
[   28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
[   28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
[   28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
[   28.688088] FS:  00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[   28.688088] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
[   28.688088] Stack:
[   28.688088]  ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
[   28.688088]  ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
[   28.688088]  ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
[   28.688088] Call Trace:
[   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
[   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
[   28.688088]  [<ffffffff81395d32>] blkcipher_walk_done+0x182/0x220
[   28.688088]  [<ffffffff813990bf>] crypto_cbc_encrypt+0x15f/0x180
[   28.688088]  [<ffffffff81399780>] ? crypto_aes_set_key+0x30/0x30
[   28.688088]  [<ffffffff8156c40c>] ceph_aes_encrypt2+0x29c/0x2e0
[   28.688088]  [<ffffffff8156d2a3>] ceph_encrypt2+0x93/0xb0
[   28.688088]  [<ffffffff8156d7da>] ceph_x_encrypt+0x4a/0x60
[   28.688088]  [<ffffffff8155b39d>] ? ceph_buffer_new+0x5d/0xf0
[   28.688088]  [<ffffffff8156e837>] ceph_x_build_authorizer.isra.6+0x297/0x360
[   28.688088]  [<ffffffff8112089b>] ? kmem_cache_alloc_trace+0x11b/0x1c0
[   28.688088]  [<ffffffff8156b496>] ? ceph_auth_create_authorizer+0x36/0x80
[   28.688088]  [<ffffffff8156ed83>] ceph_x_create_authorizer+0x63/0xd0
[   28.688088]  [<ffffffff8156b4b4>] ceph_auth_create_authorizer+0x54/0x80
[   28.688088]  [<ffffffff8155f7c0>] get_authorizer+0x80/0xd0
[   28.688088]  [<ffffffff81555a8b>] prepare_write_connect+0x18b/0x2b0
[   28.688088]  [<ffffffff81559289>] try_read+0x1e59/0x1f10

This is because we set up crypto scatterlists as if all buffers were
kmalloc'ed.  Fix it.

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

aaef3170

03 Nov, 2014 1 commit

irda: stop calling sk_prot->disconnect() on connection failure · 4cb8c359

Authored by Linus Torvalds on Nov 02, 2014

The sk_prot is irda's own set of protocol handlers, so irda should
statically know what that function is anyway, without using an indirect
pointer.  And as it happens, we know *exactly* what that pointer is
statically: it's NULL, because irda doesn't define a disconnect
operation.

So calling that function is doubly wrong, and will just cause an oops.

Reported-by: Martin Lang <mlg.hessigheim@gmail.com>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4cb8c359

01 Nov, 2014 4 commits

libceph: eliminate unnecessary allocation in process_one_ticket() · e9226d7c

Authored by Ilya Dryomov on Oct 22, 2014

Commit c27a3e4d ("libceph: do not hard code max auth ticket len"),
while fixing a buffer overflow, tried to keep as much of the
surrounding code unchanged as possible and introduced an unnecessary
kmalloc() in the unencrypted ticket path.  It is likely to fail on huge
tickets, so get rid of it.

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

e9226d7c

net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0 · e0fb6fb6

Authored by Guenter Roeck on Oct 30, 2014

If a driver supports reading EEPROM but no EEPROM is installed in the system,
the driver's get_eeprom_len function returns 0. ethtool will subsequently
try to read that zero-length EEPROM anyway. If the driver does not support
EEPROM access at all, this operation will return -EOPNOTSUPP. If the driver
does support EEPROM access but no EEPROM is installed, the operation will
return -EINVAL. Return -EOPNOTSUPP in both cases for consistency.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0fb6fb6
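
The error-code behavior described in this commit can be sketched in userspace as follows. This is a minimal illustration, not the kernel's ethtool code: `struct eeprom_ops`, `read_eeprom()`, and the two sample callbacks are hypothetical names introduced here, and the only thing taken from the commit message is that "no EEPROM support" and "zero-length EEPROM" should both yield -EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a driver's EEPROM callbacks (not struct ethtool_ops). */
struct eeprom_ops {
    int (*get_eeprom_len)(void);
    int (*get_eeprom)(unsigned char *buf, size_t len);
};

/* Sample driver: EEPROM access is implemented, but no EEPROM is installed. */
int empty_eeprom_len(void)
{
    return 0;
}

int empty_eeprom_read(unsigned char *buf, size_t len)
{
    memset(buf, 0, len);
    return 0;
}

/* Returns 0 on success or a negative errno value. */
int read_eeprom(const struct eeprom_ops *ops, unsigned char *buf, size_t len)
{
    int total_len;

    assert(ops != NULL && buf != NULL);

    /* The driver does not support EEPROM access at all. */
    if (!ops->get_eeprom || !ops->get_eeprom_len)
        return -EOPNOTSUPP;

    /* Supported, but nothing to read: report the same error for consistency. */
    total_len = ops->get_eeprom_len();
    if (total_len <= 0)
        return -EOPNOTSUPP;

    /* Requests that are invalid for an existing EEPROM still get -EINVAL. */
    if (len == 0 || len > (size_t)total_len)
        return -EINVAL;

    return ops->get_eeprom(buf, len);
}
```

With this shape, a caller only has to treat -EOPNOTSUPP as "there is no EEPROM to read", whichever of the two reasons applies.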

mpls: Allow mpls_gso to be built as module · de05c400

Authored by Pravin B Shelar on Oct 30, 2014

Kconfig already allows mpls to be built as a module. This patch
fixes the Makefile to do the same.

CC: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de05c400

mpls: Fix mpls_gso handler. · f7065f4b

Authored by Pravin B Shelar on Oct 30, 2014

The mpls gso handler needs to pull the skb after segmenting it.

CC: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f7065f4b

31 Oct, 2014 9 commits

netfilter: nft_reject_bridge: restrict reject to prerouting and input · 127917c2

Authored by Pablo Neira Ayuso on Oct 27, 2014

Restrict the reject expression to the prerouting and input bridge
hooks. If we allow this to be used from forward or any other later
bridge hook, if the frame is flooded to several ports, we'll end up
sending several reject packets, one per cloned packet.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

127917c2

netfilter: nft_reject_bridge: don't use IP stack to reject traffic · 523b929d

Authored by Pablo Neira Ayuso on Oct 25, 2014

If the packet is received via the bridge stack, this cannot reject
packets from the IP stack.

This adds functions to build the reject packet and send it from the
bridge stack. Comments and assumptions on this patch:

1) Validate the IPv4 and IPv6 headers before further processing:
   given that the packet comes from the bridge stack, we cannot assume
   they are clean. Truncated packets are dropped; we follow an approach
   similar to the existing iptables match/target extensions that need
   to inspect layer 4 headers that are not available. This also includes
   packets that are directed to multicast and broadcast ethernet
   addresses.

2) br_deliver() is exported to inject the reject packet via
   bridge localout -> postrouting. So the approach is similar to what
   we already do in the iptables reject target. The reject packet is
   sent to the bridge port from which we have received the original
   packet.

3) The reject packet is forged based on the original packet. The TTL
   is set based on sysctl_ip_default_ttl for IPv4 and per-net
   ipv6.devconf_all hoplimit for IPv6.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

523b929d

netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions · 8bfcdf66

Authored by Pablo Neira Ayuso on Oct 26, 2014

That can be reused by the reject bridge expression to build the reject
packet. The new functions are:

* nf_reject_ip6_tcphdr_get(): to sanitize and to obtain the TCP header.
* nf_reject_ip6hdr_put(): to build the IPv6 header.
* nf_reject_ip6_tcphdr_put(): to build the TCP header.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

8bfcdf66

netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions · 052b9498

Authored by Pablo Neira Ayuso on Oct 25, 2014

That can be reused by the reject bridge expression to build the reject
packet. The new functions are:

* nf_reject_ip_tcphdr_get(): to sanitize and to obtain the TCP header.
* nf_reject_iphdr_put(): to build the IPv4 header.
* nf_reject_ip_tcphdr_put(): to build the TCP header.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

052b9498

netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing · 4d87716c

Authored by Pablo Neira Ayuso on Oct 25, 2014

Fixes: 36d2af59 ("netfilter: nf_tables: allow to filter from prerouting and postrouting")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4d87716c

drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets · 5188cd44

Authored by Ben Hutchings on Oct 30, 2014

UFO is now disabled on all drivers that work with virtio net headers,
but userland may try to send UFO/IPv6 packets anyway.  Instead of
sending with ID=0, we should select identifiers on their behalf (as we
used to).

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 916e4cf4 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data")
Signed-off-by: David S. Miller <davem@davemloft.net>

5188cd44

net: skb_fclone_busy() needs to detect orphaned skb · 39bb5e62

Authored by Eric Dumazet on Oct 30, 2014

Some drivers are unable to perform TX completions in a bounded time.
They instead call skb_orphan().

The problem is that skb_fclone_busy() has to detect this case, otherwise
we block TCP retransmits and can freeze unlucky tcp sessions on
mostly idle hosts.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 1f3279ae ("tcp: avoid retransmits of TCP packets hanging in host queues")
Signed-off-by: David S. Miller <davem@davemloft.net>

39bb5e62

gre: Use inner mac length when computing tunnel length · 14051f04

Authored by Tom Herbert on Oct 30, 2014

Currently, skb_inner_network_header is used but this does not account
for Ethernet header for ETH_P_TEB. Use skb_inner_mac_header which
handles TEB and also should work with IP encapsulation in which case
inner mac and inner network headers are the same.

Tested: Ran TCP_STREAM over GRE, worked as expected.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14051f04

ipv4: Do not cache routing failures due to disabled forwarding. · fa19c2b0

Authored by Nicolas Cavallari on Oct 30, 2014

If we cache them, the kernel will reuse them, independently of
whether forwarding is enabled or not.  Which means that if forwarding is
disabled on the input interface where the first routing request comes
from, then that unreachable result will be cached and reused for
other interfaces, even if forwarding is enabled on them.  The opposite
is also true.

This can be verified with two interfaces A and B and an output interface
C, where B has forwarding enabled, but not A and trying

    ip route get $dst iif A from $src && ip route get $dst iif B from $src

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa19c2b0

30 Oct, 2014 5 commits

libceph: use memalloc flags for net IO · 89baaa57

Authored by Mike Christie on Oct 16, 2014

This patch has ceph's lib code use the memalloc flags.

If the VM layer needs to write data out to free up memory to handle new
allocation requests, the block layer must be able to make forward progress.
To handle that requirement we use structs like mempools to reserve memory for
objects like bios and requests.

The problem is when we send/receive block layer requests over the network
layer, net skb allocations can fail and the system can lock up.
To solve this, the memalloc related flags were added. NBD, iSCSI
and NFS use these flags to tell the network/vm layer that it should
use memory reserves to fulfill allocation requests for structs like
skbs.

I am running ceph in a bunch of VMs on my laptop, so this patch was
not tested very harshly.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

89baaa57

inet: frags: remove the WARN_ON from inet_evict_bucket · d70127e8

Authored by Nikolay Aleksandrov on Oct 28, 2014

The WARN_ON in inet_evict_bucket can be triggered by a valid case:
inet_frag_kill and inet_evict_bucket can be running in parallel on the
same queue which means that there has been at least one more ref added
by a previous inet_frag_find call, but inet_frag_kill can delete the
timer before inet_evict_bucket which will cause the WARN_ON() there to
trigger since we'll have refcnt!=1. Now, this case is valid because the
queue is being "killed" for some reason (removed from the chain list and
its timer deleted) so it will get destroyed in the end by one of the
inet_frag_put() calls which reaches 0 i.e. refcnt is still valid.

CC: Florian Westphal <fw@strlen.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McLean <chutzpah@gentoo.org>

Fixes: b13d3cbf ("inet: frag: move eviction of queues to work queue")
Reported-by: Patrick McLean <chutzpah@gentoo.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d70127e8

inet: frags: fix a race between inet_evict_bucket and inet_frag_kill · 65ba1f1e

Authored by Nikolay Aleksandrov on Oct 28, 2014

When the evictor is running it adds some chosen frags to a local list to
be evicted once the chain lock has been released but at the same time
the *frag_queue can be running for some of the same queues and it
may call inet_frag_kill which will wait on the chain lock and
will then delete the queue from the wrong list since it was added in the
eviction one. The fix is simple - check if the queue has the evict flag
set under the chain lock before deleting it, this is safe because the
evict flag is set only under that lock and having the flag set also means
that the queue has been detached from the chain list, so no need to delete
it again.

An important note to make is that we're safe w.r.t refcnt because
inet_frag_kill and inet_evict_bucket will sync on the del_timer operation
where only one of the two can succeed (or if the timer is executing -
none of them), the cases are:

1. inet_frag_kill succeeds in del_timer
 - then the timer ref is removed, but inet_evict_bucket will not add
   this queue to its expire list but will restart eviction in that chain
2. inet_evict_bucket succeeds in del_timer
 - then the timer ref is kept until the evictor "expires" the queue, but
   inet_frag_kill will remove the initial ref and will set
   INET_FRAG_COMPLETE which will make the frag_expire fn just to remove
   its ref.

In the end all of the queue users will do an inet_frag_put and the one
that reaches 0 will free it. The refcount balance should be okay.

CC: Florian Westphal <fw@strlen.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McLean <chutzpah@gentoo.org>

Fixes: b13d3cbf ("inet: frag: move eviction of queues to work queue")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Patrick McLean <chutzpah@gentoo.org>
Tested-by: Patrick McLean <chutzpah@gentoo.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

65ba1f1e

ipv6: notify userspace when we added or changed an ipv6 token · b2ed64a9

Authored by Lubomir Rintel on Oct 27, 2014

NetworkManager might want to know that it changed when the router advertisement
arrives.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b2ed64a9

sch_pie: schedule the timer after all init succeed · d5610902

Authored by WANG Cong on Oct 24, 2014

Cc: Vijay Subramanian <vijaynsu@cisco.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>

d5610902

29 Oct, 2014 1 commit

net: dsa: Error out on tagging protocol mismatches · ae439286

Authored by Andrew Lunn on Oct 24, 2014

If there is a mismatch between enabled tagging protocols and the
protocol the switch supports, error out, rather than continue with a
situation which is unlikely to work.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
cc: alexander.h.duyck@intel.com
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae439286

28 Oct, 2014 3 commits

ipvs: Avoid null-pointer deref in debug code · 3d53666b

Authored by Alex Gartrell on Oct 06, 2014

Use daddr instead of reaching into dest.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

3d53666b

bpf: split eBPF out of NET · f89b7755

Authored by Alexei Starovoitov on Oct 23, 2014

introduce two configs:
- hidden CONFIG_BPF to select eBPF interpreter that classic socket filters
  depend on
- visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use

that solves several problems:
- tracing and others that wish to use eBPF don't need to depend on NET.
  They can use BPF_SYSCALL to allow loading from userspace or select BPF
  to use it directly from kernel in NET-less configs.
- in 3.18 programs cannot be attached to events yet, so don't force it on
- when the rest of eBPF infra is there in 3.19+, it's still useful to
  switch it off to minimize kernel size

bloat-o-meter on x64 shows:
add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601)

tested with many different config combinations. Hopefully didn't miss anything.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f89b7755

netfilter: nft_compat: fix wrong target lookup in nft_target_select_ops() · 7965ee93

Authored by Arturo Borrero on Oct 26, 2014

The code looks for an already loaded target, and the correct list to search
is nft_target_list, not nft_match_list.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

7965ee93

27 Oct, 2014 1 commit

net: napi_reuse_skb() should check pfmemalloc · 93a35f59

Authored by Eric Dumazet on Oct 23, 2014

Do not reuse an skb if it was pfmemalloc tainted, otherwise
a future frame might be dropped anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

93a35f59

26 Oct, 2014 1 commit

tcp: md5: do not use alloc_percpu() · 349ce993

Authored by Eric Dumazet on Oct 23, 2014

percpu tcp_md5sig_pool contains memory blobs that ultimately
go through sg_set_buf().

-> sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));

This requires that whole area is in a physically contiguous portion
of memory. And that @buf is not backed by vmalloc().

Given that alloc_percpu() can use vmalloc() areas, this does not
fit the requirements.

Replace alloc_percpu() by a static DEFINE_PER_CPU() as tcp_md5sig_pool
is small anyway, there is no gain to dynamically allocate it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 765cf997 ("tcp: md5: remove one indirection level in tcp_md5sig_pool")
Reported-by: Crestez Dan Leonard <cdleonard@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
349ce993

24 Oct, 2014 4 commits

netfilter: nf_log: release skbuff on nlmsg put failure · b51d3fa3

Authored by Houcheng Lin on Oct 23, 2014

The kernel should reserve enough room in the skb so that the DONE
message can always be appended.  However, if e.g. a new attribute is
erroneously not size-accounted for, __nfulnl_send() will still
try to put the next nlmsg into this full skbuf, causing the skb to be
stuck forever and blocking delivery of further messages.

Fix the issue by releasing the skb immediately after an nlmsg_put error
and WARN() so we can track down the cause of such a size mismatch.

[ fw@strlen.de: add tailroom/len info to WARN ]
Signed-off-by: Houcheng Lin <houcheng@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

b51d3fa3

netfilter: nfnetlink_log: fix maximum packet length logged to userspace · c1e7dc91

Authored by Florian Westphal on Oct 23, 2014

don't try to queue payloads > 0xffff - NLA_HDRLEN, it does not work.
The nla length includes the size of the nla struct, so anything larger
results in u16 integer overflow.

This patch is similar to
9cefbbc9 (netfilter: nfnetlink_queue: cleanup copy_range usage).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

c1e7dc91
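
The u16 overflow mentioned in this commit can be reproduced with plain integer arithmetic. The sketch below is a userspace illustration, not the nfnetlink_log code: the helper names are hypothetical, and `NLA_HDRLEN` is redefined locally as 4 (the aligned size of `struct nlattr`, two 16-bit fields) so that the block compiles without kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NLA_HDRLEN      4                      /* local copy: aligned sizeof(struct nlattr) */
#define NLA_MAX_PAYLOAD (0xffffu - NLA_HDRLEN) /* largest payload that still fits in nla_len */

/* nla_len is a u16 that counts the attribute header plus the payload. */
uint16_t nla_len_unchecked(size_t payload)
{
    return (uint16_t)(NLA_HDRLEN + payload);   /* silently wraps above 0xffff */
}

/* Clamp the requested copy length before building the attribute. */
size_t clamp_payload(size_t payload)
{
    return payload > NLA_MAX_PAYLOAD ? NLA_MAX_PAYLOAD : payload;
}

/* Checked variant: the caller must have clamped the payload already. */
uint16_t nla_len_checked(size_t payload)
{
    assert(payload <= NLA_MAX_PAYLOAD);
    return (uint16_t)(NLA_HDRLEN + payload);
}
```

For example, a 0xffff-byte payload would produce an `nla_len` of 3 in the unchecked helper, while clamping first yields exactly 0xffff.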

netfilter: nf_log: account for size of NLMSG_DONE attribute · 9dfa1dfe

Authored by Florian Westphal on Oct 23, 2014

We currently neither account for the nlattr size, nor do we consider
the size of the trailing NLMSG_DONE when allocating nlmsg skb.

This can cause nflog to stop working, as __nfulnl_send() retries
sending forever if it failed to append NLMSG_DONE (which will never
work if the buffer is not large enough).

Reported-by: Houcheng Lin <houcheng@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

9dfa1dfe

bridge: Do not compile options in br_parse_ip_options · 7677e868

Authored by Herbert Xu on Oct 04, 2014

Commit 462fb2af

	bridge : Sanitize skb before it enters the IP stack

broke when IP options are actually used because it mangles the
skb as if it entered the IP stack which is wrong because the
bridge is supposed to operate below the IP stack.

Since nobody has actually requested parsing of IP options,
this patch fixes it by simply reverting to the previous approach
of ignoring all IP options, i.e., zeroing the IPCB.

If and when somebody who uses IP options and actually needs them
to be parsed by the bridge complains then we can revisit this.

Reported-by: David Newall <davidn@davidnewall.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7677e868

23 Oct, 2014 3 commits

net: fix saving TX flow hash in sock for outgoing connections · 9e7ceb06

Authored by Sathya Perla on Oct 22, 2014

The commit "net: Save TX flow hash in sock and set in skbuf on xmit"
introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate
and record flow hash(sk_txhash) in the socket structure. sk_txhash is used
to set skb->hash which is used to spread flows across multiple TXQs.

But, the above routines are invoked before the source port of the connection
is created. Because of this all outgoing connections that just differ in the
source port get hashed into the same TXQ.

This patch fixes this problem for IPv4/6 by invoking the above routines
after the source port is available for the socket.

Fixes: b73c3d0e ("net: Save TX flow hash in sock and set in skbuf on xmit")
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9e7ceb06
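
The ordering problem in this commit can be illustrated with a toy flow hash. This is only a sketch under stated assumptions: `struct flow` and `flow_hash()` are hypothetical, and FNV-1a is swapped in for the kernel's real flow hash purely to keep the example short. The point it shows is that hashing before the source port is assigned (port still 0) makes connections that differ only in source port indistinguishable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-tuple; the kernel keeps these fields in the socket. */
struct flow {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* One FNV-1a step (a stand-in hash, not what the kernel uses). */
static uint32_t fnv1a_step(uint32_t h, uint8_t b)
{
    return (h ^ b) * 16777619u;
}

uint32_t flow_hash(const struct flow *f)
{
    uint32_t h = 2166136261u;
    int shift;

    assert(f != 0);
    for (shift = 24; shift >= 0; shift -= 8)
        h = fnv1a_step(h, (uint8_t)(f->saddr >> shift));
    for (shift = 24; shift >= 0; shift -= 8)
        h = fnv1a_step(h, (uint8_t)(f->daddr >> shift));
    h = fnv1a_step(h, (uint8_t)(f->dport >> 8));
    h = fnv1a_step(h, (uint8_t)(f->dport & 0xff));
    h = fnv1a_step(h, (uint8_t)(f->sport >> 8));
    h = fnv1a_step(h, (uint8_t)(f->sport & 0xff));
    return h;
}
```

Two flows to the same destination hash identically while `sport` is still 0 in both; once each has its own source port, the hashes can differ and the flows can be spread across TX queues.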

xfrm6: fix a potential use after free in xfrm6_policy.c · 789f2023

Authored by Li RongQing on Oct 22, 2014

pskb_may_pull() may change skb->data and make the nh and exthdr pointers
obsolete, so recompute nh and exthdr.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

789f2023

net: tso: fix unaligned access to crafted TCP header in helper API · a63ba13e

Authored by Karl Beldan on Oct 21, 2014

The crafted header start address is from a driver supplied buffer, which
one can reasonably expect to be aligned on a 4-byte boundary.
However ATM the TSO helper API is only used by ethernet drivers and
the tcp header will then be aligned to only a 2-byte boundary from the
header start address.

Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a63ba13e
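
The alignment arithmetic behind this commit can be checked in userspace. The sketch below assumes an Ethernet header of 14 bytes followed by an IPv4 header without options (20 bytes); the macro and function names are hypothetical and defined locally. It writes the TCP sequence number byte by byte, which is one portable way to avoid a misaligned 32-bit store, and is not presented as the exact change made in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN    14  /* Ethernet header */
#define IPV4_HDR_LEN   20  /* IPv4 header without options (assumption) */
#define TCP_HDR_OFFSET (ETH_HDR_LEN + IPV4_HDR_LEN)
#define TCP_SEQ_OFFSET 4   /* seq follows the two 16-bit port fields */

/* Store a 32-bit value in network byte order without assuming alignment. */
void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* frame points at the start of the crafted headers (4-byte aligned buffer). */
void set_tcp_seq(unsigned char *frame, uint32_t seq)
{
    assert(frame != NULL);
    put_be32(frame + TCP_HDR_OFFSET + TCP_SEQ_OFFSET, seq);
}

uint32_t get_tcp_seq(const unsigned char *frame)
{
    return get_be32(frame + TCP_HDR_OFFSET + TCP_SEQ_OFFSET);
}
```

Because 14 + 20 = 34 and 34 mod 4 = 2, the TCP header (and its 32-bit `seq` field at offset 38) sits on a 2-byte boundary even when the buffer itself is 4-byte aligned.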

22 Oct, 2014 5 commits

netfilter: nf_tables: check for NULL in nf_tables_newchain pcpu stats allocation · c123bb71

Authored by Sabrina Dubroca on Oct 21, 2014

alloc_percpu returns NULL on failure, not a negative error code.

Fixes: ff3cd7b3 ("netfilter: nf_tables: refactor chain statistic routines")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

c123bb71
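
Why the two kinds of failure check are not interchangeable can be shown with a userspace imitation of the kernel's error-pointer convention. Everything named here (`is_err_ptr()`, `check_stats_buggy()`, `check_stats_fixed()`) is hypothetical; `MAX_ERRNO` mirrors the kernel's value of 4095, and the sketch only assumes that error pointers occupy the top 4095 addresses.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitation of IS_ERR(): true only for the top MAX_ERRNO addresses. */
int is_err_ptr(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Buggy pattern: NULL is not an error pointer, so a failed allocation passes. */
int check_stats_buggy(const void *stats)
{
    return is_err_ptr(stats) ? -EINVAL : 0;
}

/* Fixed pattern: an allocator that returns NULL on failure needs a NULL test. */
int check_stats_fixed(const void *stats)
{
    return stats == NULL ? -ENOMEM : 0;
}
```

With a NULL result, the buggy check reports success and the caller would go on to dereference the pointer, whereas the fixed check returns -ENOMEM.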

netfilter: ipset: off by one in ip_set_nfnl_get_byindex() · 0f9f5e1b

Authored by Dan Carpenter on Oct 21, 2014

The ->ip_set_list[] array is initialized in ip_set_net_init() and it
has ->ip_set_max elements so this check should be >= instead of >
otherwise we are off by one.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
0f9f5e1b
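
The bounds check described in this commit follows the usual rule for an array of N elements. The sketch below is a generic userspace illustration rather than the ipset code: `SET_MAX`, `set_list`, and `set_name_byindex()` are hypothetical stand-ins for `->ip_set_max`, `->ip_set_list[]`, and the lookup function.

```c
#include <assert.h>
#include <stddef.h>

#define SET_MAX 4  /* hypothetical table size, standing in for ->ip_set_max */

static const char *const set_list[SET_MAX] = { "set0", "set1", "set2", "set3" };

const char *set_name_byindex(size_t index)
{
    /*
     * Valid indices are 0 .. SET_MAX - 1. Testing `index > SET_MAX` instead
     * would let index == SET_MAX through and read one element past the end.
     */
    if (index >= SET_MAX)
        return NULL;

    assert(set_list[index] != NULL);  /* every slot is populated in this sketch */
    return set_list[index];
}
```

Calling `set_name_byindex(SET_MAX)` returns NULL instead of touching memory beyond the array.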

netfilter: nf_conntrack: allow server to become a client in TW handling · e37ad9fd

Authored by Marcelo Leitner on Oct 13, 2014

When a port that was used to listen for inbound connections gets closed
and reused for outgoing connections (like rsh ends up doing for stderr
flow), currently we may reject the SYN/ACK packet for the new connection
because the tcp_conntracks states forbid a port from becoming a client
while there is still a TIME_WAIT entry in there for it.

As TCP may expire the TIME_WAIT socket in 60s and conntrack's timeout
for it is 120s, there is a ~60s window in which the application can end
up opening a port that conntrack will end up blocking.

This patch fixes this by simply allowing such a state transition: if we
see a SYN, in TIME_WAIT state, on REPLY direction, move it to sSS. Note
that the rest of the code already handles this situation, more
specifically in tcp_packet(), first switch clause.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

e37ad9fd

net: sched: initialize bstats syncp · 7c1c97d5

Authored by Sabrina Dubroca on Oct 21, 2014

Use netdev_alloc_pcpu_stats to allocate percpu stats and initialize syncp.

Fixes: 22e0f8b9 "net: sched: make bstats per cpu and estimator RCU safe"
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c1c97d5

netlink: Re-add locking to netlink_lookup() and seq walker · 78fd1d0a

Authored by Thomas Graf on Oct 21, 2014

The synchronize_rcu() in netlink_release() introduces unacceptable
latency. Reintroduce minimal lookup so we can drop the
synchronize_rcu() until socket destruction has been RCUfied.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

78fd1d0a

openanolis / cloud-kernel synced successfully almost 2 years ago