- 31 3月, 2015 13 次提交
-
-
由 Chuck Lever 提交于
Allow each memory registration mode to plug in a callout that handles the completion of a memory registration operation. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
The open op determines the size of various transport data structures based on device capabilities and memory registration mode. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Memory Region objects associated with a transport instance are destroyed before the instance is shutdown and destroyed. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
This method is invoked when a transport instance is about to be reconnected. Each Memory Region object is reset to its initial state. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
This method is used when setting up a new transport instance to create a pool of Memory Region objects that will be used to register memory during operation. Memory Regions are not needed for "physical" registration, since ->prepare and ->release are no-ops for that mode. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
There is very little common processing among the different external memory deregistration functions. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
There is very little common processing among the different external memory registration functions. Have rpcrdma_create_chunks() call the registration method directly. This removes a stack frame and a switch statement from the external registration path. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
The max_payload computation is generalized to ensure that the payload maximum is the lesser of RPC_MAX_DATA_SEGS and the number of data segments that can be transmitted in an inline buffer. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Instead of employing switch() statements, let's use the typical Linux kernel idiom for handling behavioral variation: virtual functions. Start by defining a vector of operations for each supported memory registration mode, and by adding a source file for each mode. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
If a provider advertizes a zero max_fast_reg_page_list_len, FRWR depth detection loops forever. Instead of just failing the mount, try other memory registration modes. Fixes: 0fc6c4e7 ("xprtrdma: mind the device's max fast . . .") Reported-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
The RPC/RDMA transport's FRWR registration logic registers whole pages. This means areas in the first and last pages that are not involved in the RDMA I/O are needlessly exposed to the server. Buffered I/O is typically page-aligned, so not a problem there. But for direct I/O, which can be byte-aligned, and for reply chunks, which are nearly always smaller than a page, the transport could expose memory outside the I/O buffer. FRWR allows byte-aligned memory registration, so let's use it as it was intended. Reported-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Commit 6ab59945 ("xprtrdma: Update rkeys after transport reconnect" added logic in the ->send_request path to update the chunk list when an RPC/RDMA request is retransmitted. Note that rpc_xdr_encode() resets and re-encodes the entire RPC send buffer for each retransmit of an RPC. The RPC send buffer is not preserved from the previous transmission of an RPC. Revert 6ab59945, and instead, just force each request to be fully marshaled every time through ->send_request. This should preserve the fix from 6ab59945, while also performing pullup during retransmits. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 23 3月, 2015 1 次提交
-
-
由 Pablo Neira Ayuso 提交于
ip6tables extensions check for this flag to restrict match/target to a given protocol. Without this flag set, SYNPROXY6 returns an error. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Acked-by: NPatrick McHardy <kaber@trash.net>
-
- 21 3月, 2015 5 次提交
-
-
由 Al Viro 提交于
Cc: stable@vger.kernel.org # v3.19 Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Catalin Marinas 提交于
Commit db31c55a (net: clamp ->msg_namelen instead of returning an error) introduced the clamping of msg_namelen when the unsigned value was larger than sizeof(struct sockaddr_storage). This caused a msg_namelen of -1 to be valid. The native code was subsequently fixed by commit dbb490b9 (net: socket: error on a negative msg_namelen). In addition, the native code sets msg_namelen to 0 when msg_name is NULL. This was done in commit (6a2a2b3a net:socket: set msg_namelen to 0 if msg_name is passed as NULL in msghdr struct from userland) and subsequently updated by 08adb7da (fold verify_iovec() into copy_msghdr_from_user()). This patch brings the get_compat_msghdr() in line with copy_msghdr_from_user(). Fixes: db31c55a (net: clamp ->msg_namelen instead of returning an error) Cc: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Josh Hunt 提交于
tcp_send_fin() does not account for the memory it allocates properly, so sk_forward_alloc can be negative in cases where we've sent a FIN: ss example output (ss -amn | grep -B1 f4294): tcp FIN-WAIT-1 0 1 192.168.0.1:45520 192.0.2.1:8080 skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0) Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steven Barth 提交于
for throw routes to trigger evaluation of other policy rules EAGAIN needs to be propagated up to fib_rules_lookup similar to how its done for IPv4 A simple testcase for verification is: ip -6 rule add lookup 33333 priority 33333 ip -6 route add throw 2001:db8::1 ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333 ip route get 2001:db8::1 Signed-off-by: NSteven Barth <cyrus@openwrt.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sabrina Dubroca 提交于
Matt Grant reported frequent crashes in ipv6_select_ident when udp6_ufo_fragment is called from openvswitch on a skb that doesn't have a dst_entry set. ipv6_proxy_select_ident generates the frag_id without using the dst associated with the skb. This approach was suggested by Vladislav Yasevich. Fixes: 0508c07f ("ipv6: Select fragment id during UFO segmentation if not set.") Cc: Vladislav Yasevich <vyasevic@redhat.com> Reported-by: NMatt Grant <matt@mattgrant.net.nz> Tested-by: NMatt Grant <matt@mattgrant.net.nz> Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Acked-by: NVladislav Yasevich <vyasevic@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 3月, 2015 1 次提交
-
-
由 Pablo Neira Ayuso 提交于
We have to check for IP6T_INV_PROTO in invflags, instead of flags. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Acked-by: NBalazs Scheidler <bazsi@balabit.hu>
-
- 19 3月, 2015 1 次提交
-
-
由 Pablo Neira Ayuso 提交于
Since fab4085f ("netfilter: log: nf_log_packet() as real unified interface"), the loginfo structure that is passed to nf_log_packet() is used to explicitly indicate the logger type you want to use. This is a problem for people tracing rules through nfnetlink_log since packets are always routed to the NF_LOG_TYPE logger after the aforementioned patch. We can fix this by removing the trace loginfo structures, but that still changes the log level from 4 to 5 for tracing messages and there may be someone relying on this outthere. So let's just introduce a new nf_log_trace() function that restores the former behaviour. Reported-by: NMarkus Kötter <koetter@rrzn.uni-hannover.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 18 3月, 2015 3 次提交
-
-
由 Daniel Borkmann 提交于
Revisiting commit d23b8ad8 ("tc: add BPF based action") with regards to eBPF support, I was thinking that it might be better to improve return semantics from a BPF program invoked through BPF_PROG_RUN(). Currently, in case filter_res is 0, we overwrite the default action opcode with TC_ACT_SHOT. A default action opcode configured through tc's m_bpf can be: TC_ACT_RECLASSIFY, TC_ACT_PIPE, TC_ACT_SHOT, TC_ACT_UNSPEC, TC_ACT_OK. In cls_bpf, we have the possibility to overwrite the default class associated with the classifier in case filter_res is _not_ 0xffffffff (-1). That allows us to fold multiple [e]BPF programs into a single one, where they would otherwise need to be defined as a separate classifier with its own classid, needlessly redoing parsing work, etc. Similarly, we could do better in act_bpf: Since above TC_ACT* opcodes are exported to UAPI anyway, we reuse them for return-code-to-tc-opcode mapping, where we would allow above possibilities. Thus, like in cls_bpf, a filter_res of 0xffffffff (-1) means that the configured _default_ action is used. Any unkown return code from the BPF program would fail in tcf_bpf() with TC_ACT_UNSPEC. Should we one day want to make use of TC_ACT_STOLEN or TC_ACT_QUEUED, which both have the same semantics, we have the option to either use that as a default action (filter_res of 0xffffffff) or non-default BPF return code. All that will allow us to transparently use tcf_bpf() for both BPF flavours. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
I got the following trace with current net-next kernel : [14723.885290] WARNING: CPU: 26 PID: 22658 at kernel/sched/core.c:7285 __might_sleep+0x89/0xa0() [14723.885325] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810e8734>] prepare_to_wait_exclusive+0x34/0xa0 [14723.885355] CPU: 26 PID: 22658 Comm: netserver Not tainted 4.0.0-dbg-DEV #1379 [14723.885359] ffffffff81a223a8 ffff881fae9e7ca8 ffffffff81650b5d 0000000000000001 [14723.885364] ffff881fae9e7cf8 ffff881fae9e7ce8 ffffffff810a72e7 0000000000000000 [14723.885367] ffffffff81a57620 000000000000093a 0000000000000000 ffff881fae9e7e64 [14723.885371] Call Trace: [14723.885377] [<ffffffff81650b5d>] dump_stack+0x4c/0x65 [14723.885382] [<ffffffff810a72e7>] warn_slowpath_common+0x97/0xe0 [14723.885386] [<ffffffff810a73e6>] warn_slowpath_fmt+0x46/0x50 [14723.885390] [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0 [14723.885393] [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0 [14723.885396] [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0 [14723.885399] [<ffffffff810ccdc9>] __might_sleep+0x89/0xa0 [14723.885403] [<ffffffff81581846>] lock_sock_nested+0x36/0xb0 [14723.885406] [<ffffffff815829a3>] ? release_sock+0x173/0x1c0 [14723.885411] [<ffffffff815ea1f7>] inet_csk_accept+0x157/0x2a0 [14723.885415] [<ffffffff810e8900>] ? abort_exclusive_wait+0xc0/0xc0 [14723.885419] [<ffffffff8161b96d>] inet_accept+0x2d/0x150 [14723.885424] [<ffffffff8157db6f>] SYSC_accept4+0xff/0x210 [14723.885428] [<ffffffff8165a451>] ? retint_swapgs+0xe/0x44 [14723.885431] [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0 [14723.885437] [<ffffffff81369c0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [14723.885441] [<ffffffff8157ef40>] SyS_accept+0x10/0x20 [14723.885444] [<ffffffff81659872>] system_call_fastpath+0x12/0x17 [14723.885447] ---[ end trace ff74cd83355b1873 ]--- In commit 26cabd31 Peter added a sched_annotate_sleep() in sk_wait_event() Is the following patch needed as well ? Alternative would be to use sk_wait_event() from inet_csk_wait_for_connect() Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
After commit 2b0bb01b, the kernel returns -ENOBUFS when user tries to add an existing tunnel with ioctl API: $ ip -6 tunnel add ip6tnl1 mode ip6ip6 dev eth1 add tunnel "ip6tnl0" failed: No buffer space available It's confusing, the right error is EEXIST. This patch also change a bit the code returned: - ENOBUFS -> ENOMEM - ENOENT -> ENODEV Fixes: 2b0bb01b ("ip6_tunnel: Return an error when adding an existing tunnel.") CC: Steffen Klassert <steffen.klassert@secunet.com> Reported-by: NPierre Cheynier <me@pierre-cheynier.net> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2015 1 次提交
-
-
由 Pablo Neira Ayuso 提交于
If there's an existing base chain, we have to allow to change the default policy without indicating the hook information. However, if the chain doesn't exists, we have to enforce the presence of the hook attribute. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 16 3月, 2015 6 次提交
-
-
由 Johannes Berg 提交于
If the AP is confused and starts doing a CSA to the same channel, just ignore that request instead of trying to act it out since it was likely sent in error anyway. In the case of the bug I was investigating the GO was misbehaving and sending out a beacon with CSA IEs still included after having actually done the channel switch. Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Johannes Berg 提交于
As HT/VHT depend heavily on QoS/WMM, it's not a good idea to let userspace add clients that have HT/VHT but not QoS/WMM. Since it does so in certain cases we've observed (client is using HT IEs but not QoS/WMM) just ignore the HT/VHT info at this point and don't pass it down to the drivers which might unconditionally use it. Cc: stable@vger.kernel.org Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Johannes Berg 提交于
When a beacon from the AP contains only the ECSA IE, and not a CSA IE as well, this ECSA IE is not considered for calculating the CRC and the beacon might be dropped as not being interesting. This is clearly wrong, it should be handled and the channel switch should be executed. Fix this by including the ECSA IE ID in the bitmap of interesting IEs. Reported-by: NGil Tribush <gil.tribush@intel.com> Reviewed-by: NLuciano Coelho <luciano.coelho@intel.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Andrei Otcheretianski 提交于
Since moving the interface combination checks to mac80211, it's broken because it now only considers interfaces with an assigned channel context, so for example any interface that isn't active can still be up, which is clearly an issue; also, in particular P2P-Device wdevs are an issue since they never have a chanctx. Fix this by counting running interfaces instead the ones with a channel context assigned. Cc: stable@vger.kernel.org [3.16+] Fixes: 73de86a3 ("cfg80211/mac80211: move interface counting for combination check to mac80211") Signed-off-by: NAndrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com> [rewrite commit message, dig out the commit it fixes] Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Al Viro 提交于
[I would really like an ACK on that one from dhowells; it appears to be quite straightforward, but...] MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit in there. It gets passed via flags; in fact, another such check in the same function is done correctly - as flags & MSG_PEEK. It had been that way (effectively disabled) for 8 years, though, so the patch needs beating up - that case had never been tested. If it is correct, it's -stable fodder. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Al Viro 提交于
It should be checking flags, not msg->msg_flags. It's ->sendmsg() instances that need to look for that in ->msg_flags, ->recvmsg() ones (including the other ->recvmsg() instance in that file, as well as unix_dgram_recvmsg() this one claims to be imitating) check in flags. Braino had been introduced in commit dcda13 ("caif: Bugfix - use MSG_TRUNC in receive") back in 2010, so it goes quite a while back. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 3月, 2015 1 次提交
-
-
由 Venkat Venkatsubra 提交于
On adding an interface br_add_if() sets the MTU to the min of all the interfaces. Do the same thing on removing an interface too in br_del_if. Signed-off-by: NVenkat Venkatsubra <venkat.x.venkatsubra@oracle.com> Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 3月, 2015 1 次提交
-
-
由 Eric Dumazet 提交于
inet_diag_dump_one_icsk() allocates too small skb. Add inet_sk_attr_size() helper right before inet_sk_diag_fill() so that it can be updated if/when new attributes are added. iproute2/ss currently does not use this dump_one() interface, this might explain nobody noticed this problem yet. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 3月, 2015 2 次提交
-
-
由 Herbert Xu 提交于
When we get back an EAGAIN from rhashtable_walk_next we were treating it as a valid object which obviously doesn't work too well. Luckily this is hard to trigger so it seems nobody has run into it yet. This patch fixes it by redoing the next call when we get an EAGAIN. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Michael S. Tsirkin 提交于
On device hot-unplug, 9p/virtio currently will kfree channel while it might still be in use. Of course, it might stay used forever, so it's an extremely ugly hack, but it seems better than use-after-free that we have now. [ Unused variable removed, whitespace cleanup, msg single-lined --RR ] Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 12 3月, 2015 5 次提交
-
-
由 Ian Wilson 提交于
nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(), nfnl_cthelper_get() and nfnl_cthelper_del(). In each case they pass a pointer to an nf_conntrack_tuple data structure local variable: struct nf_conntrack_tuple tuple; ... ret = nfnl_cthelper_parse_tuple(&tuple, tb[NFCTH_TUPLE]); The problem is that this local variable is not initialized, and nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and dst.protonum. This leaves all other fields with undefined values based on whatever is on the stack: tuple->src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM])); tuple->dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]); The symptom observed was that when the rpc and tns helpers were added then traffic to port 1536 was being sent to user-space. Signed-off-by: NIan Wilson <iwilson@brocade.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Arnd Bergmann 提交于
The rds_iw_update_cm_id function stores a large 'struct rds_sock' object on the stack in order to pass a pair of addresses. This happens to just fit withint the 1024 byte stack size warning limit on x86, but just exceed that limit on ARM, which gives us this warning: net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=] As the use of this large variable is basically bogus, we can rearrange the code to not do that. Instead of passing an rds socket into rds_iw_get_device, we now just pass the two addresses that we have available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly, to create two address structures on the stack there. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Test that sk != NULL before reading sk->sk_tsflags. Fixes: 49ca0d8b ("net-timestamp: no-payload option") Reported-by: NOne Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
John reported that my previous commit added a regression on his router. This is because sender_cpu & napi_id share a common location, so get_xps_queue() can see garbage and perform an out of bound access. We need to make sure sender_cpu is cleared before doing the transmit, otherwise any NIC busy poll enabled (skb_mark_napi_id()) can trigger this bug. Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NJohn <jw@nuclearfallout.net> Bisected-by: NJohn <jw@nuclearfallout.net> Fixes: 2bd82484 ("xps: fix xps for stacked devices") Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexey Kodanev 提交于
sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be set to incorrect values. Given that 'struct sk_buff' allocates from rcvbuf, incorrectly set buffer length could result to memory allocation failures. For example, set them as follows: # sysctl net.core.rmem_default=64 net.core.wmem_default = 64 # sysctl net.core.wmem_default=64 net.core.wmem_default = 64 # ping localhost -s 1024 -i 0 > /dev/null This could result to the following failure: skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32 head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL> kernel BUG at net/core/skbuff.c:102! invalid opcode: 0000 [#1] SMP ... task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000 RIP: 0010:[<ffffffff8155fbd1>] [<ffffffff8155fbd1>] skb_put+0xa1/0xb0 RSP: 0018:ffff88003ae8bc68 EFLAGS: 00010296 RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000 RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8 RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000 R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900 FS: 00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0 Stack: ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550 ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00 Call Trace: [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470 [<ffffffff81556f56>] sock_write_iter+0x146/0x160 [<ffffffff811d9612>] new_sync_write+0x92/0xd0 [<ffffffff811d9cd6>] vfs_write+0xd6/0x180 [<ffffffff811da499>] SyS_write+0x59/0xd0 [<ffffffff81651532>] system_call_fastpath+0x12/0x17 Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 RIP [<ffffffff8155fbd1>] skb_put+0xa1/0xb0 RSP <ffff88003ae8bc68> Kernel panic - not syncing: Fatal exception Moreover, the possible minimum is 1, so we can get another kernel panic: ... BUG: unable to handle kernel paging request at ffff88013caee5c0 IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0 ... Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-