- 13 9月, 2012 1 次提交
-
-
由 David Howells 提交于
Give the key type the opportunity to preparse the payload prior to the instantiation and update routines being called. This is done with the provision of two new key type operations: int (*preparse)(struct key_preparsed_payload *prep); void (*free_preparse)(struct key_preparsed_payload *prep); If the first operation is present, then it is called before key creation (in the add/update case) or before the key semaphore is taken (in the update and instantiate cases). The second operation is called to clean up if the first was called. preparse() is given the opportunity to fill in the following structure: struct key_preparsed_payload { char *description; void *type_data[2]; void *payload; const void *data; size_t datalen; size_t quotalen; }; Before the preparser is called, the first three fields will have been cleared, the payload pointer and size will be stored in data and datalen and the default quota size from the key_type struct will be stored into quotalen. The preparser may parse the payload in any way it likes and may store data in the type_data[] and payload fields for use by the instantiate() and update() ops. The preparser may also propose a description for the key by attaching it as a string to the description field. This can be used by passing a NULL or "" description to the add_key() system call or the key_create_or_update() function. This cannot work with request_key() as that required the description to tell the upcall about the key to be created. This, for example permits keys that store PGP public keys to generate their own name from the user ID and public key fingerprint in the key. The instantiate() and update() operations are then modified to look like this: int (*instantiate)(struct key *key, struct key_preparsed_payload *prep); int (*update)(struct key *key, struct key_preparsed_payload *prep); and the new payload data is passed in *prep, whether or not it was preparsed. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 06 9月, 2012 1 次提交
-
-
由 Mikulas Patocka 提交于
Commit 644595f8 ("compat: Handle COMPAT_USE_64BIT_TIME in net/socket.c") introduced a bug where the helper functions to take either a 64-bit or compat time[spec|val] got the arguments in the wrong order, passing the kernel stack pointer off as a user pointer (and vice versa). Because of the user address range check, that in turn then causes an EFAULT due to the user pointer range checking failing for the kernel address. Incorrectly resuling in a failed system call for 32-bit processes with a 64-bit kernel. On odder architectures like HP-PA (with separate user/kernel address spaces), it can be used read kernel memory. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 9月, 2012 2 次提交
-
-
由 Julia Lawall 提交于
Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julia Lawall 提交于
Initialize return variable before exiting on an error path. The initial initialization of the return variable is also dropped, because that value is never used. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 8月, 2012 5 次提交
-
-
由 Pablo Neira Ayuso 提交于
Existing code assumes that del_timer returns true for alive conntrack entries. However, this is not true if reliable events are enabled. In that case, del_timer may return true for entries that were just inserted in the dying list. Note that packets / ctnetlink may hold references to conntrack entries that were just inserted to such list. This patch fixes the issue by adding an independent timer for event delivery. This increases the size of the ecache extension. Still we can revisit this later and use variable size extensions to allocate this area on demand. Tested-by: NOliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Eric Dumazet 提交于
Following lockdep splat was reported by Pavel Roskin : [ 1570.586223] =============================== [ 1570.586225] [ INFO: suspicious RCU usage. ] [ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted [ 1570.586229] ------------------------------- [ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage! [ 1570.586233] [ 1570.586233] other info that might help us debug this: [ 1570.586233] [ 1570.586236] [ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0 [ 1570.586238] 2 locks held by Chrome_IOThread/4467: [ 1570.586240] #0: (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0 [ 1570.586253] #1: (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270 [ 1570.586260] [ 1570.586260] stack backtrace: [ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98 [ 1570.586265] Call Trace: [ 1570.586271] [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130 [ 1570.586275] [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270 [ 1570.586278] [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0 [ 1570.586282] [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90 [ 1570.586285] [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80 [ 1570.586290] [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0 [ 1570.586293] [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0 [ 1570.586296] [<ffffffff814f2c35>] release_sock+0x55/0xa0 [ 1570.586300] [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50 [ 1570.586305] [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230 [ 1570.586308] [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40 [ 1570.586312] [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0 [ 1570.586315] [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0 [ 1570.586320] [<ffffffff814ec435>] do_sock_write+0xc5/0xe0 [ 1570.586323] [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80 [ 1570.586328] [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0 [ 1570.586332] [<ffffffff8114c5a5>] vfs_write+0x165/0x180 [ 1570.586335] [<ffffffff8114c805>] sys_write+0x45/0x90 [ 1570.586340] [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NPavel Roskin <proski@gnu.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Francesco Ruggeri 提交于
When tearing down a net namespace, ipv4 mr_table structures are freed without first deactivating their timers. This can result in a crash in run_timer_softirq. This patch mimics the corresponding behaviour in ipv6. Locking and synchronization seem to be adequate. We are about to kfree mrt, so existing code should already make sure that no other references to mrt are pending or can be created by incoming traffic. The functions invoked here do not cause new references to mrt or other race conditions to be created. Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive. Both ipmr_expire_process (whose completion we may have to wait in del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock or other synchronizations when needed, and they both only modify mrt. Tested in Linux 3.4.8. Signed-off-by: NFrancesco Ruggeri <fruggeri@aristanetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 xeb@mail.ru 提交于
Avoid to use synchronize_rcu in l2tp_tunnel_free because context may be atomic. Signed-off-by: NDmitry Kozlov <xeb@mail.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
We're hitting bug while trying to reinsert an already existing expectation: kernel BUG at kernel/timer.c:895! invalid opcode: 0000 [#1] SMP [...] Call Trace: <IRQ> [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack] [<ffffffff812d423a>] ? in4_pton+0x72/0x131 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip] [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip] [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip] [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip] We have to remove the RTP expectation if the RTCP expectation hits EBUSY since we keep trying with other ports until we succeed. Reported-by: NRafal Fitt <rafalf@aplusc.com.pl> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 30 8月, 2012 4 次提交
-
-
由 Julia Lawall 提交于
Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Julia Lawall 提交于
Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Julia Lawall 提交于
Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr> Acked-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Amerigo Wang 提交于
Against -net. In the patch "netpoll: re-enable irq in poll_napi()", I tried to fix the following warning: [100718.051041] ------------[ cut here ]------------ [100718.051048] WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x7d/0xb0() (Not tainted) [100718.051049] Hardware name: ProLiant BL460c G7 ... [100718.051068] Call Trace: [100718.051073] [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0 [100718.051075] [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20 [100718.051077] [<ffffffff810747ed>] ? local_bh_enable_ip+0x7d/0xb0 [100718.051080] [<ffffffff8150041b>] ? _spin_unlock_bh+0x1b/0x20 [100718.051085] [<ffffffffa00ee974>] ? be_process_mcc+0x74/0x230 [be2net] [100718.051088] [<ffffffffa00ea68c>] ? be_poll_tx_mcc+0x16c/0x290 [be2net] [100718.051090] [<ffffffff8144fe76>] ? netpoll_poll_dev+0xd6/0x490 [100718.051095] [<ffffffffa01d24a5>] ? bond_poll_controller+0x75/0x80 [bonding] [100718.051097] [<ffffffff8144fde5>] ? netpoll_poll_dev+0x45/0x490 [100718.051100] [<ffffffff81161b19>] ? ksize+0x19/0x80 [100718.051102] [<ffffffff81450437>] ? netpoll_send_skb_on_dev+0x157/0x240 by reenabling IRQ before calling ->poll, but it seems more problems are introduced after that patch: http://ozlabs.org/~akpm/stuff/IMG_20120824_122054.jpg http://marc.info/?l=linux-netdev&m=134563282530588&w=2 So it is safe to fix be2net driver code directly. This patch reverts the offending commit and fixes be_poll() by avoid disabling BH there, this is okay because be_poll() can be called either by poll_napi() which already disables IRQ, or by net_rx_action() which already disables BH. Reported-by: NAndrew Morton <akpm@linux-foundation.org> Reported-by: NSylvain Munaut <s.munaut@whatever-company.com> Cc: Sylvain Munaut <s.munaut@whatever-company.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: NCong Wang <amwang@redhat.com> Tested-by: NSylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 8月, 2012 2 次提交
-
-
由 Yuchung Cheng 提交于
The cwnd reduction in fast recovery is based on the number of packets newly delivered per ACK. For non-sack connections every DUPACK signifies a packet has been delivered, but the sender mistakenly skips counting them for cwnd reduction. The fix is to compute newly_acked_sacked after DUPACKs are accounted in sacked_out for non-sack connections. Signed-off-by: NYuchung Cheng <ycheng@google.com> Acked-by: NNandita Dukkipati <nanditad@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
Non-root user-space processes can send Netlink messages to other processes that are well-known for being subscribed to Netlink asynchronous notifications. This allows ilegitimate non-root process to send forged messages to Netlink subscribers. The userspace process usually verifies the legitimate origin in two ways: a) Socket credentials. If UID != 0, then the message comes from some ilegitimate process and the message needs to be dropped. b) Netlink portID. In general, portID == 0 means that the origin of the messages comes from the kernel. Thus, discarding any message not coming from the kernel. However, ctnetlink sets the portID in event messages that has been triggered by some user-space process, eg. conntrack utility. So other processes subscribed to ctnetlink events, eg. conntrackd, know that the event was triggered by some user-space action. Neither of the two ways to discard ilegitimate messages coming from non-root processes can help for ctnetlink. This patch adds capability validation in case that dst_pid is set in netlink_sendmsg(). This approach is aggressive since existing applications using any Netlink bus to deliver messages between two user-space processes will break. Note that the exception is NETLINK_USERSOCK, since it is reserved for netlink-to-netlink userspace communication. Still, if anyone wants that his Netlink bus allows netlink-to-netlink userspace, then they can set NL_NONROOT_SEND. However, by default, I don't think it makes sense to allow to use NETLINK_ROUTE to communicate two processes that are sending no matter what information that is not related to link/neighbouring/routing. They should be using NETLINK_USERSOCK instead for that. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 8月, 2012 2 次提交
-
-
由 Eric Dumazet 提交于
Multicast traffic allocates dst with DST_NOCACHE, but dst is not inserted into rt_uncached_list. This slowdown multicast workloads on SMP because rt_uncached_lock is contended. Change the test before taking the lock to actually check the dst was inserted into rt_uncached_list. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Fengguang Wu 提交于
cc: Eric Leblond <eric@regit.org> Signed-off-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 8月, 2012 1 次提交
-
-
由 Eric Dumazet 提交于
Sylvain Munault reported following info : - TCP connection get "stuck" with data in send queue when doing "large" transfers ( like typing 'ps ax' on a ssh connection ) - Only happens on path where the PMTU is lower than the MTU of the interface - Is not present right after boot, it only appears 10-20min after boot or so. (and that's inside the _same_ TCP connection, it works fine at first and then in the same ssh session, it'll get stuck) - Definitely seems related to fragments somehow since I see a router sending ICMP message saying fragmentation is needed. - Exact same setup works fine with kernel 3.5.1 Problem happens when the 10 minutes (ip_rt_mtu_expires) expiration period is over. ip_rt_update_pmtu() calls dst_set_expires() to rearm a new expiration, but dst_set_expires() does nothing because dst.expires is already set. It seems we want to set the expires field to a new value, regardless of prior one. With help from Julian Anastasov. Reported-by: NSylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: NEric Dumazet <edumazet@google.com> CC: Julian Anastasov <ja@ssi.bg> Tested-by: NSylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 8月, 2012 6 次提交
-
-
由 Thomas Pedersen 提交于
The destination address of unicast frames forwarded through a mesh gate was being replaced with the broadcast address. Instead leave the original destination address as the mesh DA. If the nexthop address is not in the mpath table it will be resolved. If that fails, the frame will be forwarded to known mesh gates. Reported-by: NCedric Voncken <cedric.voncken@acksys.fr> Signed-off-by: NThomas Pedersen <thomas@cozybit.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Jim Schutt 提交于
Because the Ceph client messenger uses a non-blocking connect, it is possible for the sending of the client banner to race with the arrival of the banner sent by the peer. When ceph_sock_state_change() notices the connect has completed, it schedules work to process the socket via con_work(). During this time the peer is writing its banner, and arrival of the peer banner races with con_work(). If con_work() calls try_read() before the peer banner arrives, there is nothing for it to do, after which con_work() calls try_write() to send the client's banner. In this case Ceph's protocol negotiation can complete succesfully. The server-side messenger immediately sends its banner and addresses after accepting a connect request, *before* actually attempting to read or verify the banner from the client. As a result, it is possible for the banner from the server to arrive before con_work() calls try_read(). If that happens, try_read() will read the banner and prepare protocol negotiation info via prepare_write_connect(). prepare_write_connect() calls con_out_kvec_reset(), which discards the as-yet-unsent client banner. Next, con_work() calls try_write(), which sends the protocol negotiation info rather than the banner that the peer is expecting. The result is that the peer sees an invalid banner, and the client reports "negotiation failed". Fix this by moving con_out_kvec_reset() out of prepare_write_connect() to its callers at all locations except the one where the banner might still need to be sent. [elder@inktak.com: added note about server-side behavior] Signed-off-by: NJim Schutt <jaschut@sandia.gov> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Eric Dumazet 提交于
Pablo Neira Ayuso discovered that avahi and potentially NetworkManager accept spoofed Netlink messages because of a kernel bug. The kernel passes all-zero SCM_CREDENTIALS ancillary data to the receiver if the sender did not provide such data, instead of not including any such data at all or including the correct data from the peer (as it is the case with AF_UNIX). This bug was introduced in commit 16e57262 (af_unix: dont send SCM_CREDENTIALS by default) This patch forces passing credentials for netlink, as before the regression. Another fix would be to not add SCM_CREDENTIALS in netlink messages if not provided by the sender, but it might break some programs. With help from Florian Weimer & Petr Matousek This issue is designated as CVE-2012-3520 Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Christian Casteyde reported a kmemcheck 32-bit read from uninitialized memory in __ip_select_ident(). It turns out that __ip_make_skb() called ip_select_ident() before properly initializing iph->daddr. This is a bug uncovered by commit 1d861aa4 (inet: Minimize use of cached route inetpeer.) Addresses https://bugzilla.kernel.org/show_bug.cgi?id=46131Reported-by: NChristian Casteyde <casteyde.christian@free.fr> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Paasch 提交于
Since 0e734419 ("ipv4: Use inet_csk_route_child_sock() in DCCP and TCP."), inet_csk_route_child_sock() is called instead of inet_csk_route_req(). However, after creating the child-sock in tcp/dccp_v4_syn_recv_sock(), ireq->opt is set to NULL, before calling inet_csk_route_child_sock(). Thus, inside inet_csk_route_child_sock() opt is always NULL and the SRR-options are not respected anymore. Packets sent by the server won't have the correct destination-IP. This patch fixes it by accessing newinet->inet_opt instead of ireq->opt inside inet_csk_route_child_sock(). Reported-by: NLuca Boccassi <luca.boccassi@gmail.com> Signed-off-by: NChristoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Commit 6f458dfb (tcp: improve latencies of timer triggered events) added bug leading to following trace : [ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000 [ 2866.131726] [ 2866.132188] ========================= [ 2866.132281] [ BUG: held lock freed! ] [ 2866.132281] 3.6.0-rc1+ #622 Not tainted [ 2866.132281] ------------------------- [ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there! [ 2866.132281] (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6 [ 2866.132281] 4 locks held by kworker/0:1/652: [ 2866.132281] #0: (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f [ 2866.132281] #1: ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f [ 2866.132281] #2: (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6 [ 2866.132281] #3: (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f [ 2866.132281] [ 2866.132281] stack backtrace: [ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622 [ 2866.132281] Call Trace: [ 2866.132281] <IRQ> [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159 [ 2866.132281] [<ffffffff818a0839>] ? __sk_free+0xfd/0x114 [ 2866.132281] [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a [ 2866.132281] [<ffffffff818a0839>] __sk_free+0xfd/0x114 [ 2866.132281] [<ffffffff818a08c0>] sk_free+0x1c/0x1e [ 2866.132281] [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56 [ 2866.132281] [<ffffffff81078082>] run_timer_softirq+0x218/0x35f [ 2866.132281] [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f [ 2866.132281] [<ffffffff810f5831>] ? rb_commit+0x58/0x85 [ 2866.132281] [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148 [ 2866.132281] [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9 [ 2866.132281] [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e [ 2866.132281] [<ffffffff81a1227c>] call_softirq+0x1c/0x30 [ 2866.132281] [<ffffffff81039f38>] do_softirq+0x4a/0xa6 [ 2866.132281] [<ffffffff81070f2b>] irq_exit+0x51/0xad [ 2866.132281] [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4 [ 2866.132281] [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f [ 2866.132281] <EOI> [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1 [ 2866.132281] [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56 [ 2866.132281] [<ffffffff81078692>] mod_timer+0x178/0x1a9 [ 2866.132281] [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26 [ 2866.132281] [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4 [ 2866.132281] [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70 [ 2866.132281] [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4 [ 2866.132281] [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1 [ 2866.132281] [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a [ 2866.132281] [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6 [ 2866.132281] [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5 [ 2866.132281] [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f [ 2866.132281] [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb [ 2866.132281] [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4 [ 2866.132281] [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5 [ 2866.132281] [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f [ 2866.132281] [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd [ 2866.132281] [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb [ 2866.132281] [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43 [ 2866.132281] [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80 [ 2866.132281] [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0 [ 2866.132281] [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61 [ 2866.132281] [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1 [ 2866.132281] [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db [ 2866.132281] [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c [ 2866.132281] [<ffffffff81999d92>] call_transmit+0x1c5/0x20e [ 2866.132281] [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225 [ 2866.132281] [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c [ 2866.132281] [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34 [ 2866.132281] [<ffffffff810835d6>] process_one_work+0x24d/0x47f [ 2866.132281] [<ffffffff81083567>] ? process_one_work+0x1de/0x47f [ 2866.132281] [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225 [ 2866.132281] [<ffffffff81083a6d>] worker_thread+0x236/0x317 [ 2866.132281] [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f [ 2866.132281] [<ffffffff8108b7b8>] kthread+0x9a/0xa2 [ 2866.132281] [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10 [ 2866.132281] [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13 [ 2866.132281] [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a [ 2866.132281] [<ffffffff81a12180>] ? gs_change+0x13/0x13 [ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000 [ 2866.309689] ============================================================================= [ 2866.310254] BUG TCP (Not tainted): Object already free [ 2866.310254] ----------------------------------------------------------------------------- [ 2866.310254] The bug comes from the fact that timer set in sk_reset_timer() can run before we actually do the sock_hold(). socket refcount reaches zero and we free the socket too soon. timer handler is not allowed to reduce socket refcnt if socket is owned by the user, or we need to change sk_reset_timer() implementation. We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED was used instead of TCP_DELACK_TIMER_DEFERRED. For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED, even if not fired from a timer. Reported-by: NFengguang Wu <fengguang.wu@intel.com> Tested-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 8月, 2012 4 次提交
-
-
由 J. Bruce Fields 提交于
The rpc server tries to ensure that there will be room to send a reply before it receives a request. It does this by tracking, in xpt_reserved, an upper bound on the total size of the replies that is has already committed to for the socket. Currently it is adding in the estimate for a new reply *before* it checks whether there is space available. If it finds that there is not space, it then subtracts the estimate back out. This may lead the subsequent svc_xprt_enqueue to decide that there is space after all. The results is a svc_recv() that will repeatedly return -EAGAIN, causing server threads to loop without doing any actual work. Cc: stable@vger.kernel.org Reported-by: NMichael Tokarev <mjt@tls.msk.ru> Tested-by: NMichael Tokarev <mjt@tls.msk.ru> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply. However, the XPT_CLOSE won't be acted on immediately. Meanwhile other threads could send further replies before the socket is really shut down. This can manifest as data corruption: for example, if a truncated read reply is followed by another rpc reply, that second reply will look to the client like further read data. Symptoms were data corruption preceded by svc_tcp_sendto logging something like kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket Cc: stable@vger.kernel.org Reported-by: NMalahal Naineni <malahal@us.ibm.com> Tested-by: NMalahal Naineni <malahal@us.ibm.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if sk_tcplen would lead us to expect n pages of data). svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen. This is OK, since both functions are serialized by XPT_BUSY. However, that means the inconsistency must be repaired before dropping XPT_BUSY. Therefore we should be ensuring that svc_tcp_save_pages repairs the problem before exiting svc_tcp_recv_record on error. Symptoms were a BUG() in svc_tcp_clear_pages. Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Sage Weil 提交于
The debugfs directory includes the cluster fsid and our unique global_id. We need to delay the initialization of the debug entry until we have learned both the fsid and our global_id from the monitor or else the second client can't create its debugfs entry and will fail (and multiple client instances aren't properly reflected in debugfs). Reported by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NYehuda Sadeh <yehuda@inktank.com>
-
- 20 8月, 2012 6 次提交
-
-
由 Patrick McHardy 提交于
Commit 1db20a52 (nfnetlink_log: Stop using NLA_PUT*().) incorrectly converted a NLA_PUT_BE16 macro to nla_put_be32() in nfnetlink_log: - NLA_PUT_BE16(inst->skb, NFULA_HWTYPE, htons(skb->dev->type)); + if (nla_put_be32(inst->skb, NFULA_HWTYPE, htons(skb->dev->type)) Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Neal Cardwell 提交于
This commit removes the sk_rx_dst_set calls from tcp_create_openreq_child(), because at that point the icsk_af_ops field of ipv6_mapped TCP sockets has not been set to its proper final value. Instead, to make sure we get the right sk_rx_dst_set variant appropriate for the address family of the new connection, we have tcp_v{4,6}_syn_recv_sock() directly call the appropriate function shortly after the call to tcp_create_openreq_child() returns. This also moves inet6_sk_rx_dst_set() to avoid a forward declaration with the new approach. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Reported-by: NArtem Savkov <artem.savkov@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Randy Dunlap 提交于
Fix kernel-doc warning: Warning(net/core/dev.c:5745): No description found for parameter 'dev' Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Commit 97bab73f (inet: Hide route peer accesses behind helpers.) introduced a bug in xfrm6_policy_destroy(). The xfrm_dst's _rt6i_peer member is not initialized, causing a false positive result from inetpeer_ptr_is_peer(), which in turn causes a NULL pointer dereference in inet_putpeer(). Pid: 314, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #17 To Be Filled By O.E.M. To Be Filled By O.E.M./P4S800D-X EIP: 0060:[<c03abf93>] EFLAGS: 00010246 CPU: 0 EIP is at inet_putpeer+0xe/0x16 EAX: 00000000 EBX: f3481700 ECX: 00000000 EDX: 000dd641 ESI: f3481700 EDI: c05e949c EBP: f551def4 ESP: f551def4 DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000070 CR3: 3243d000 CR4: 00000750 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 f551df04 c0423de1 00000000 f3481700 f551df18 c038d5f7 f254b9f8 f551df28 f34f85d8 f551df20 c03ef48d f551df3c c0396870 f30697e8 f24e1738 c05e98f4 f5509540 c05cd2b4 f551df7c c0142d2b c043feb5 f5509540 00000000 c05cd2e8 [<c0423de1>] xfrm6_dst_destroy+0x42/0xdb [<c038d5f7>] dst_destroy+0x1d/0xa4 [<c03ef48d>] xfrm_bundle_flo_delete+0x2b/0x36 [<c0396870>] flow_cache_gc_task+0x85/0x9f [<c0142d2b>] process_one_work+0x122/0x441 [<c043feb5>] ? apic_timer_interrupt+0x31/0x38 [<c03967eb>] ? flow_cache_new_hashrnd+0x2b/0x2b [<c0143e2d>] worker_thread+0x113/0x3cc Fix by adding a init_dst() callback to struct xfrm_policy_afinfo to properly initialize the dst's peer pointer. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Juhl 提交于
In net/caif/chnl_net.c::chnl_recv_cb() we call skb_header_pointer() which may return NULL, but we do not check for a NULL pointer before dereferencing it. This patch adds such a NULL check and properly free's allocated memory and return an error (-EINVAL) on failure - much better than crashing.. Signed-off-by: NJesper Juhl <jj@chaosbits.net> Acked-by: NSjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Leblond 提交于
If a packet is emitted on one socket in one group of fanout sockets, it is transmitted again. It is thus read again on one of the sockets of the fanout group. This result in a loop for software which generate packets when receiving one. This retransmission is not the intended behavior: a fanout group must behave like a single socket. The packet should not be transmitted on a socket if it originates from a socket belonging to the same fanout group. This patch fixes the issue by changing the transmission check to take fanout group info account. Reported-by: NAleksandr Kotov <a1k@mail.ru> Signed-off-by: NEric Leblond <eric@regit.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 8月, 2012 4 次提交
-
-
由 John Fastabend 提交于
A race exists where creating cgroups and also updating the priomap may result in losing a priomap update. This is because priomap writers are not protected by rtnl_lock. Move priority writer into rtnl_lock()/rtnl_unlock(). CC: Neil Horman <nhorman@tuxdriver.com> Reported-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Fastabend 提交于
A socket fd passed in a SCM_RIGHTS datagram was not getting updated with the new tasks cgrp prioidx. This leaves IO on the socket tagged with the old tasks priority. To fix this add a check in the scm recvmsg path to update the sock cgrp prioidx with the new tasks value. Thanks to Al Viro for catching this. CC: Neil Horman <nhorman@tuxdriver.com> Reported-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Fastabend 提交于
Add lock to prevent a race with a file closing and also remove useless and ugly sscanf code. The extra code was never needed and the case it supposedly protected against is in fact handled correctly by sock_from_file as pointed out by Al Viro. CC: Neil Horman <nhorman@tuxdriver.com> Reported-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
We drop packet unconditionally when we fail to mirror it. This is not intended in some cases. Consdier for kvm guest, we may mirror the traffic of the bridge to a tap device used by a VM. When kernel fails to mirror the packet in conditions such as when qemu crashes or stop polling the tap, it's hard for the management software to detect such condition and clean the the mirroring before. This would lead all packets to the bridge to be dropped and break the netowrk of other virtual machines. To solve the issue, the patch does not drop packets when kernel fails to mirror it, and only drop the redirected packets. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 8月, 2012 2 次提交
-
-
由 Pablo Neira Ayuso 提交于
In __nf_ct_expect_check, the function refresh_timer returns 1 if a matching expectation is found and its timer is successfully refreshed. This results in nf_ct_expect_related returning 0. Note that at this point: - the passed expectation is not inserted in the expectation table and its timer was not initialized, since we have refreshed one matching/existing expectation. - nf_ct_expect_alloc uses kmem_cache_alloc, so the expectation timer is in some undefined state just after the allocation, until it is appropriately initialized. This can be a problem for the SIP helper during the expectation addition: ... if (nf_ct_expect_related(rtp_exp) == 0) { if (nf_ct_expect_related(rtcp_exp) != 0) nf_ct_unexpect_related(rtp_exp); ... Note that nf_ct_expect_related(rtp_exp) may return 0 for the timer refresh case that is detailed above. Then, if nf_ct_unexpect_related(rtcp_exp) returns != 0, nf_ct_unexpect_related(rtp_exp) is called, which does: spin_lock_bh(&nf_conntrack_lock); if (del_timer(&exp->timeout)) { nf_ct_unlink_expect(exp); nf_ct_expect_put(exp); } spin_unlock_bh(&nf_conntrack_lock); Note that del_timer always returns false if the timer has been initialized. However, the timer was not initialized since setup_timer was not called, therefore, the expectation timer remains in some undefined state. If I'm not missing anything, this may lead to the removal an unexistent expectation. To fix this, the optimization that allows refreshing an expectation is removed. Now nf_conntrack_expect_related looks more consistent to me since it always add the expectation in case that it returns success. Thanks to Patrick McHardy for participating in the discussion of this patch. I think this may be the source of the problem described by: http://marc.info/?l=netfilter-devel&m=134073514719421&w=2Reported-by: NRafal Fitt <rafalf@aplusc.com.pl> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Mathias Krause 提交于
The implementation of dev_ifconf() for the compat ioctl interface uses an intermediate ifc structure allocated in userland for the duration of the syscall. Though, it fails to initialize the padding bytes inserted for alignment and that for leaks four bytes of kernel stack. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: NMathias Krause <minipli@googlemail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-