- 14 3月, 2016 1 次提交
-
-
由 liping.zhang 提交于
There is no need to use the static variable here, pr_info_once is more concise. Signed-off-by: NLiping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 3月, 2016 3 次提交
-
-
由 Tom Herbert 提交于
Add a new msg flag called MSG_BATCH. This flag is used in sendmsg to indicate that more messages will follow (i.e. a batch of messages is being sent). This is similar to MSG_MORE except that the following messages are not merged into one packet, they are sent individually. sendmmsg is updated so that each contained message except for the last one is marked as MSG_BATCH. MSG_BATCH is a performance optimization in cases where a socket implementation can benefit by transmitting packets in a batch. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This patch allows setting MSG_EOR in each individual msghdr passed in sendmmsg. This allows a sendmmsg to send multiple messages when using SOCK_SEQPACKET. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Export it for cases where we want to create sockets by hand. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 1月, 2016 1 次提交
-
-
由 Vladimir Davydov 提交于
Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below: - threadinfo - task_struct - task_delay_info - pid - cred - mm_struct - vm_area_struct and vm_region (nommu) - anon_vma and anon_vma_chain - signal_struct - sighand_struct - fs_struct - files_struct - fdtable and fdtable->full_fds_bits - dentry and external_name - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method. The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not account everything in fact). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 1月, 2016 1 次提交
-
-
由 Eric Dumazet 提交于
Applications often have to reduce number of datagrams they receive or send per system call to avoid starvation problems. Really the kernel should take care of this by using cond_resched(), so that applications can experiment bigger batch sizes. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 12月, 2015 1 次提交
-
-
由 Nicolai Stange 提交于
Commit ceb5d58b ("net: fix sock_wake_async() rcu protection") from the current 4.4 release cycle introduced a new flags member in struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA from struct socket's flags member into that new place. Unfortunately, the new flags field is never initialized properly, at least not for the struct socket_wq instance created in sock_alloc_inode(). One particular issue I encountered because of this is that my GNU Emacs failed to draw anything on my desktop -- i.e. what I got is a transparent window, including the title bar. Bisection lead to the commit mentioned above and further investigation by means of strace told me that Emacs is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is reproducible 100% of times and the fact that properly initializing the struct socket_wq ->flags fixes the issue leads me to the conclusion that somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags, preventing my Emacs from receiving any SIGIO's due to data becoming available and it got stuck. Make sock_alloc_inode() set the newly created struct socket_wq's ->flags member to zero. Fixes: ceb5d58b ("net: fix sock_wake_async() rcu protection") Signed-off-by: NNicolai Stange <nicstange@gmail.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 12月, 2015 1 次提交
-
-
由 tadeusz.struk@intel.com 提交于
msg_iocb needs to be initialized on the recv/recvfrom path. Otherwise afalg will wrongly interpret it as an async call. Cc: stable@vger.kernel.org Reported-by: NHarald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: NTadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 12月, 2015 2 次提交
-
-
由 Eric Dumazet 提交于
Dmitry provided a syzkaller (http://github.com/google/syzkaller) triggering a fault in sock_wake_async() when async IO is requested. Said program stressed af_unix sockets, but the issue is generic and should be addressed in core networking stack. The problem is that by the time sock_wake_async() is called, we should not access the @flags field of 'struct socket', as the inode containing this socket might be freed without further notice, and without RCU grace period. We already maintain an RCU protected structure, "struct socket_wq" so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it is the safe route. It also reduces number of cache lines needing dirtying, so might provide a performance improvement anyway. In followup patches, we might move remaining flags (SOCK_NOSPACE, SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes and let 'struct socket' being mostly read and let it being shared between cpus. Reported-by: NDmitry Vyukov <dvyukov@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This patch is a cleanup to make following patch easier to review. Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA from (struct socket)->flags to a (struct socket_wq)->flags to benefit from RCU protection in sock_wake_async() To ease backports, we rename both constants. Two new helpers, sk_set_bit(int nr, struct sock *sk) and sk_clear_bit(int net, struct sock *sk) are added so that following patch can change their implementation. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 9月, 2015 1 次提交
-
-
由 Viresh Kumar 提交于
IS_ERR(_OR_NULL) already contain an 'unlikely' compiler flag and there is no need to do that again from its callers. Drop it. Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 11 5月, 2015 2 次提交
-
-
由 Eric W. Biederman 提交于
This is long overdue, and is part of cleaning up how we allocate kernel sockets that don't reference count struct net. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
There is no need for tun to do the weird network namespace refcounting. The existing network namespace refcounting in tfile has almost exactly the same lifetime. So rewrite the code to use the struct sock network namespace refcounting and remove the unnecessary hand rolled network namespace refcounting and the unncesary tfile->net. This change allows the tun code to directly call sock_put bypassing sock_release and making SOCK_EXTERNALLY_ALLOCATED unnecessary. Remove the now unncessary tun_release so that if anything tries to use the sock_release code path the kernel will oops, and let us know about the bug. The macvtap code already uses it's internal socket this way. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 4月, 2015 1 次提交
-
-
由 David Howells 提交于
socket inodes and sunrpc filesystems - inodes owned by that code Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 12 4月, 2015 3 次提交
-
-
由 Al Viro 提交于
All places outside of core VFS that checked ->read and ->write for being NULL or called the methods directly are gone now, so NULL {read,write} with non-NULL {read,write}_iter will do the right thing in all cases. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
convert open-coded instances Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
it's equal to iov_iter_count(&msg->msg_iter) in all cases Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 09 4月, 2015 3 次提交
-
-
由 Al Viro 提交于
For kernel_sendmsg() that eliminates the need to play with setfs(); for kernel_recvmsg() it does *not* - a couple of callers are using it with non-NULL ->msg_control, which would be treated as userland address on recvmsg side of things. In all cases we are really setting a kvec-backed iov_iter, though. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 24 3月, 2015 1 次提交
-
-
由 tadeusz.struk@intel.com 提交于
Add support for async operations. Signed-off-by: NTadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 3月, 2015 1 次提交
-
-
由 Al Viro 提交于
Cc: stable@vger.kernel.org # v3.19 Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 3月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
The AIO interface is fairly complex because it tries to allow filesystems to always work async and then wakeup a synchronous caller through aio_complete. It turns out that basically no one was doing this to avoid the complexity and context switches, and we've already fixed up the remaining users and can now get rid of this case. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 13 3月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
There is no need to pass the total request length in the kiocb, as we already get passed in through the iov_iter argument. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 03 3月, 2015 1 次提交
-
-
由 Ying Xue 提交于
After TIPC doesn't depend on iocb argument in its internal implementations of sendmsg() and recvmsg() hooks defined in proto structure, no any user is using iocb argument in them at all now. Then we can drop the redundant iocb argument completely from kinds of implementations of both sendmsg() and recvmsg() in the entire networking stack. Cc: Christoph Hellwig <hch@lst.de> Suggested-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 3月, 2015 1 次提交
-
-
由 Eyal Birger 提交于
Commit 97775007 ("af_packet: add interframe drop cmsg (v6)") unionized skb->mark and skb->dropcount in order to allow recording of the socket drop count while maintaining struct sk_buff size. skb->dropcount was introduced since there was no available room in skb->cb[] in packet sockets. However, its introduction led to the inability to export skb->mark, or any other aliased field to userspace if so desired. Moving the dropcount metric to skb->cb[] eliminates this problem at the expense of 4 bytes less in skb->cb[] for protocol families using it. Signed-off-by: NEyal Birger <eyal.birger@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2015 2 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 29 1月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
The sock_iocb structure is allocate on stack for each read/write-like operation on sockets, and contains various fields of which only the embedded msghdr and sometimes a pointer to the scm_cookie is ever used. Get rid of the sock_iocb and put a msghdr directly on the stack and pass the scm_cookie explicitly to netlink_mmap_sendmsg. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 1月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 1月, 2015 1 次提交
-
-
由 Nicolas Dichtel 提交于
This field already contains the length of the iovec, no need to calculate it again. Suggested-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 1月, 2015 1 次提交
-
-
由 Nicolas Dichtel 提交于
Better to use available helpers. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 12月, 2014 1 次提交
-
-
由 Al Viro 提交于
Reported-by: NPavel Emelyanov <xemul@parallels.com> Acked-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 11 12月, 2014 1 次提交
-
-
由 Al Viro 提交于
As it is, default ->i_fop has NULL ->open() (along with all other methods). The only case where it matters is reopening (via procfs symlink) a file that didn't get its ->f_op from ->i_fop - anything else will have ->i_fop assigned to something sane (default would fail on read/write/ioctl/etc.). Unfortunately, such case exists - alloc_file() users, especially anon_get_file() ones. There we have tons of opened files of very different kinds sharing the same inode. As the result, attempt to reopen those via procfs succeeds and you get a descriptor you can't do anything with. Moreover, in case of sockets we set ->i_fop that will only be used on such reopen attempts - and put a failing ->open() into it to make sure those do not succeed. It would be simpler to put such ->open() into default ->i_fop and leave it unchanged both for anon inode (as we do anyway) and for socket ones. Result: * everything going through do_dentry_open() works as it used to * sock_no_open() kludge is gone * attempts to reopen anon-inode files fail as they really ought to * ditto for aio_private_file() * ditto for perfmon - this one actually tried to imitate sock_no_open() trick, but failed to set ->i_fop, so in the current tree reopens succeed and yield completely useless descriptor. Intent clearly had been to fail with -ENXIO on such reopens; now it actually does. * everything else that used alloc_file() keeps working - it has ->i_fop set for its inodes anyway Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 10 12月, 2014 2 次提交
-
-
由 Al Viro 提交于
Note that the code _using_ ->msg_iter at that point will be very unhappy with anything other than unshifted iovec-backed iov_iter. We still need to convert users to proper primitives. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Gu Zheng 提交于
Introduce helper function do_sock_sendmsg() to simplify sock_sendmsg{_nosec}, and replace reduplicate code. Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 11月, 2014 3 次提交
-
-
由 Al Viro 提交于
... and do the same on the compat side of things. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
use {compat_,}rw_copy_check_uvector(). As the result, we are guaranteed that all iovecs seen in ->msg_iov by ->sendmsg() and ->recvmsg() will pass access_ok(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Kernel-side struct msghdr is (currently) using the same layout as userland one, but it's not a one-to-one copy - even without considering 32bit compat issues, we have msg_iov, msg_name and msg_control copied to kernel[1]. It's fairly localized, so we get away with a few functions where that knowledge is needed (and we could shrink that set even more). Pretty much everything deals with the kernel-side variant and the few places that want userland one just use a bunch of force-casts to paper over the differences. The thing is, kernel-side definition of struct msghdr is *not* exposed in include/uapi - libc doesn't see it, etc. So we can add struct user_msghdr, with proper annotations and let the few places that ever deal with those beasts use it for userland pointers. Saner typechecking aside, that will allow to change the layout of kernel-side msghdr - e.g. replace msg_iov/msg_iovlen there with struct iov_iter, getting rid of the need to modify the iovec as we copy data to/from it, etc. We could introduce kernel_msghdr instead, but that would create much more noise - the absolute majority of the instances would need to have the type switched to kernel_msghdr and definition of struct msghdr in include/linux/socket.h is not going to be seen by userland anyway. This commit just introduces user_msghdr and switches the few places that are dealing with userland-side msghdr to it. [1] actually, it's even trickier than that - we copy msg_control for sendmsg, but keep the userland address on recvmsg. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 10 9月, 2014 1 次提交
-
-
由 Ani Sinha 提交于
Linux manpage for recvmsg and sendmsg calls does not explicitly mention setting msg_namelen to 0 when msg_name passed set as NULL. When developers don't set msg_namelen member in msghdr, it might contain garbage value which will fail the validation check and sendmsg and recvmsg calls from kernel will return EINVAL. This will break old binaries and any code for which there is no access to source code. To fix this, we set msg_namelen to 0 when msg_name is passed as NULL from userland. Signed-off-by: NAni Sinha <ani@arista.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-