1. 09 Apr, 2015 1 commit
  2. 24 Mar, 2015 1 commit
  3. 21 Mar, 2015 1 commit
    • C
      net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour · 91edd096
      Catalin Marinas committed
      Commit db31c55a (net: clamp ->msg_namelen instead of returning an
      error) introduced the clamping of msg_namelen when the unsigned value
      was larger than sizeof(struct sockaddr_storage). This caused a
      msg_namelen of -1 to be valid. The native code was subsequently fixed by
      commit dbb490b9 (net: socket: error on a negative msg_namelen).
      
      In addition, the native code sets msg_namelen to 0 when msg_name is
      NULL. This was done in commit 6a2a2b3a (net:socket: set msg_namelen
      to 0 if msg_name is passed as NULL in msghdr struct from userland) and
      subsequently updated by 08adb7da (fold verify_iovec() into
      copy_msghdr_from_user()).
      
      This patch brings get_compat_msghdr() in line with
      copy_msghdr_from_user().
      
      Fixes: db31c55a (net: clamp ->msg_namelen instead of returning an error)
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
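The three rules that commit 91edd096 describes for msg_name and msg_namelen can be sketched in userland C. This is a minimal model, not the kernel code: `struct name_fields` and `normalise_name_fields()` are hypothetical names, and the order of the checks is an assumption based on the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>	/* struct sockaddr_storage */

/* Hypothetical stand-in for the name fields copied from userland. */
struct name_fields {
	void *msg_name;
	int   msg_namelen;
};

/*
 * Normalise the fields in place:
 *   - a NULL name forces the length to 0,
 *   - a negative length is rejected,
 *   - an oversized length is clamped to sizeof(struct sockaddr_storage).
 * Returns 0 on success or -EINVAL.
 */
static int normalise_name_fields(struct name_fields *m)
{
	assert(m != NULL);

	if (m->msg_name == NULL)
		m->msg_namelen = 0;

	if (m->msg_namelen < 0)
		return -EINVAL;

	if ((size_t)m->msg_namelen > sizeof(struct sockaddr_storage))
		m->msg_namelen = (int)sizeof(struct sockaddr_storage);

	return 0;
}
```

Testing for a negative length before clamping is what keeps a msg_namelen of -1 from being treated as a huge unsigned value and silently clamped.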
  4. 24 Feb, 2015 1 commit
    • C
      net: compat: Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg · d720d8ce
      Catalin Marinas committed
      With commit a7526eb5 (net: Unbreak compat_sys_{send,recv}msg), the
      MSG_CMSG_COMPAT flag is blocked at the compat syscall entry points,
      changing the kernel compat behaviour from the one before the commit it
      was trying to fix (1be374a0, net: Block MSG_CMSG_COMPAT in
      send(m)msg and recv(m)msg).
      
      On 32-bit kernels (!CONFIG_COMPAT), MSG_CMSG_COMPAT is 0 and the native
      32-bit sys_sendmsg() allows flag 0x80000000 to be set (it is ignored by
      the kernel). However, on a 64-bit kernel, the compat ABI is different
      with commit a7526eb5.
      
      This patch changes the compat_sys_{send,recv}msg behaviour to the one
      prior to commit 1be374a0.
      
      The problem was found running 32-bit LTP (sendmsg01) binary on an arm64
      kernel. Arguably, LTP should not pass 0xffffffff as flags to sendmsg()
      but the general rule is not to break user ABI (even when the user
      behaviour is not entirely sane).
      
      Fixes: a7526eb5 (net: Unbreak compat_sys_{send,recv}msg)
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
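The ABI difference discussed in commit d720d8ce can be illustrated with a small userland model. Everything here is a sketch under stated assumptions: `MODEL_MSG_CMSG_COMPAT` reuses the 0x80000000 value quoted in the commit message, and `model_inner_sendmsg()`, `compat_entry_rejecting()` and `compat_entry_ignoring()` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <errno.h>

/* Value quoted in the commit message for kernels built with compat support. */
#define MODEL_MSG_CMSG_COMPAT 0x80000000u

static unsigned int last_flags_seen;

/* Hypothetical stand-in for the shared send path; records the flags it saw. */
static long model_inner_sendmsg(unsigned int flags)
{
	/* The compat entry points always set the internal marker themselves. */
	assert(flags & MODEL_MSG_CMSG_COMPAT);
	last_flags_seen = flags;
	return 0;
}

/* Behaviour described for a7526eb5: a user-supplied marker bit is an error. */
static long compat_entry_rejecting(unsigned int flags)
{
	if (flags & MODEL_MSG_CMSG_COMPAT)
		return -EINVAL;
	return model_inner_sendmsg(flags | MODEL_MSG_CMSG_COMPAT);
}

/* Behaviour restored by d720d8ce: the user-supplied bit is simply ignored. */
static long compat_entry_ignoring(unsigned int flags)
{
	return model_inner_sendmsg(flags | MODEL_MSG_CMSG_COMPAT);
}
```

With flags of 0xffffffff, as LTP's sendmsg01 passes, the rejecting variant returns -EINVAL while the ignoring variant proceeds. In this model, that is the difference a 32-bit binary would observe.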
  5. 23 Feb, 2015 1 commit
  6. 10 Dec, 2014 1 commit
  7. 20 Nov, 2014 3 commits
    • A
      fold verify_iovec() into copy_msghdr_from_user() · 08adb7da
      Al Viro committed
      ... and do the same on the compat side of things.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • A
      {compat_,}verify_iovec(): switch to generic copying of iovecs · 08449320
      Al Viro committed
      use {compat_,}rw_copy_check_uvector().  As a result, we are
      guaranteed that all iovecs seen in ->msg_iov by ->sendmsg()
      and ->recvmsg() will pass access_ok().
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • A
      separate kernel- and userland-side msghdr · 666547ff
      Al Viro committed
      Kernel-side struct msghdr is (currently) using the same layout as
      userland one, but it's not a one-to-one copy - even without considering
      32bit compat issues, we have msg_iov, msg_name and msg_control copied
      to kernel[1].  It's fairly localized, so we get away with a few functions
      where that knowledge is needed (and we could shrink that set even
      more).  Pretty much everything deals with the kernel-side variant and
      the few places that want userland one just use a bunch of force-casts
      to paper over the differences.
      
      The thing is, kernel-side definition of struct msghdr is *not* exposed
      in include/uapi - libc doesn't see it, etc.  So we can add struct user_msghdr,
      with proper annotations and let the few places that ever deal with those
      beasts use it for userland pointers.  Saner typechecking aside, that will
      allow changing the layout of kernel-side msghdr - e.g. replace
      msg_iov/msg_iovlen there with struct iov_iter, getting rid of the need
      to modify the iovec as we copy data to/from it, etc.
      
      We could introduce kernel_msghdr instead, but that would create much more
      noise - the absolute majority of the instances would need to have the
      type switched to kernel_msghdr and definition of struct msghdr in
      include/linux/socket.h is not going to be seen by userland anyway.
      
      This commit just introduces user_msghdr and switches the few places that
      are dealing with userland-side msghdr to it.
      
      [1] actually, it's even trickier than that - we copy msg_control for
      sendmsg, but keep the userland address on recvmsg.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  8. 30 Jul, 2014 1 commit
    • A
      net: sendmsg: fix NULL pointer dereference · 40eea803
      Andrey Ryabinin committed
      Sasha's report:
      	> While fuzzing with trinity inside a KVM tools guest running the latest -next
      	> kernel with the KASAN patchset, I've stumbled on the following spew:
      	>
      	> [ 4448.949424] ==================================================================
      	> [ 4448.951737] AddressSanitizer: user-memory-access on address 0
      	> [ 4448.952988] Read of size 2 by thread T19638:
      	> [ 4448.954510] CPU: 28 PID: 19638 Comm: trinity-c76 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813
      	> [ 4448.956823]  ffff88046d86ca40 0000000000000000 ffff880082f37e78 ffff880082f37a40
      	> [ 4448.958233]  ffffffffb6e47068 ffff880082f37a68 ffff880082f37a58 ffffffffb242708d
      	> [ 4448.959552]  0000000000000000 ffff880082f37a88 ffffffffb24255b1 0000000000000000
      	> [ 4448.961266] Call Trace:
      	> [ 4448.963158] dump_stack (lib/dump_stack.c:52)
      	> [ 4448.964244] kasan_report_user_access (mm/kasan/report.c:184)
      	> [ 4448.965507] __asan_load2 (mm/kasan/kasan.c:352)
      	> [ 4448.966482] ? netlink_sendmsg (net/netlink/af_netlink.c:2339)
      	> [ 4448.967541] netlink_sendmsg (net/netlink/af_netlink.c:2339)
      	> [ 4448.968537] ? get_parent_ip (kernel/sched/core.c:2555)
      	> [ 4448.970103] sock_sendmsg (net/socket.c:654)
      	> [ 4448.971584] ? might_fault (mm/memory.c:3741)
      	> [ 4448.972526] ? might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3740)
      	> [ 4448.973596] ? verify_iovec (net/core/iovec.c:64)
      	> [ 4448.974522] ___sys_sendmsg (net/socket.c:2096)
      	> [ 4448.975797] ? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
      	> [ 4448.977030] ? lock_release_holdtime (kernel/locking/lockdep.c:273)
      	> [ 4448.978197] ? lock_release_non_nested (kernel/locking/lockdep.c:3434 (discriminator 1))
      	> [ 4448.979346] ? check_chain_key (kernel/locking/lockdep.c:2188)
      	> [ 4448.980535] __sys_sendmmsg (net/socket.c:2181)
      	> [ 4448.981592] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
      	> [ 4448.982773] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607)
      	> [ 4448.984458] ? syscall_trace_enter (arch/x86/kernel/ptrace.c:1500 (discriminator 2))
      	> [ 4448.985621] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
      	> [ 4448.986754] SyS_sendmmsg (net/socket.c:2201)
      	> [ 4448.987708] tracesys (arch/x86/kernel/entry_64.S:542)
      	> [ 4448.988929] ==================================================================
      
      This report means that we reached netlink_sendmsg() with msg->msg_name == NULL and msg->msg_namelen > 0.
      
      The report was not followed by the usual "Unable to handle kernel NULL pointer dereference",
      which gave me a clue that address 0 is mapped and contains a valid socket address structure.
      
      This bug was introduced in f3d33426
      (net: rework recvmsg handler msg_name and msg_namelen logic).
      Commit message states that:
      	"Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      	 non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      	 affect sendto as it would bail out earlier while trying to copy-in the
      	 address."
      But in fact this affects sendto when address 0 is mapped and contains a
      socket address structure. In that case the copy-in of the address succeeds,
      and verify_iovec() returns successfully with msg->msg_namelen > 0
      and msg->msg_name == NULL.
      
      This patch fixes it by setting msg_namelen to 0 if msg_name == NULL.
      
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: <stable@vger.kernel.org>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
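From userland, the combination that commit 40eea803 deals with (msg_name == NULL with a non-zero msg_namelen) can be produced with an ordinary sendmsg() call. The sketch below assumes a Linux kernel that contains the fix and uses a connected AF_UNIX datagram socketpair. The helper name `send_with_null_name()` and the stale length of 16 are illustrative choices.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send `len` bytes over a connected AF_UNIX datagram socketpair with
 * msg_name == NULL but a non-zero msg_namelen. Returns the result of
 * sendmsg(), or -1 if the socketpair could not be created.
 */
static ssize_t send_with_null_name(const void *buf, size_t len)
{
	int sv[2];
	struct iovec iov;
	struct msghdr msg;
	ssize_t n;

	assert(buf != NULL);

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	iov.iov_base = (void *)buf;
	iov.iov_len = len;

	memset(&msg, 0, sizeof(msg));
	msg.msg_name = NULL;	/* no destination address ...          */
	msg.msg_namelen = 16;	/* ... but a stale, non-zero length    */
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	n = sendmsg(sv[0], &msg, 0);

	close(sv[0]);
	close(sv[1]);
	return n;
}
```

On such a kernel the stale length should be reset to 0 before any protocol handler sees it, so the call is expected to behave like a plain send on the connected socket.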
  9. 06 Mar, 2014 2 commits
  10. 31 Jan, 2014 1 commit
    • P
      x86, x32: Correct invalid use of user timespec in the kernel · 2def2ef2
      PaX Team committed
      The x32 case for the recvmmsg() timeout handling is broken:
      
        asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
                                            unsigned int vlen, unsigned int flags,
                                            struct compat_timespec __user *timeout)
        {
                int datagrams;
                struct timespec ktspec;
      
                if (flags & MSG_CMSG_COMPAT)
                        return -EINVAL;
      
                if (COMPAT_USE_64BIT_TIME)
                        return __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
                                              flags | MSG_CMSG_COMPAT,
                                              (struct timespec *) timeout);
                ...
      
      The timeout pointer parameter is provided by userland (hence the __user
      annotation) but for x32 syscalls it's simply cast to a kernel pointer
      and is passed to __sys_recvmmsg which will eventually directly
      dereference it for both reading and writing.  Other callers to
      __sys_recvmmsg properly copy from userland to the kernel first.
      
      The bug was introduced by commit ee4fa23c ("compat: Use
      COMPAT_USE_64BIT_TIME in net/compat.c") and should affect all kernels
      since 3.4 (and perhaps vendor kernels if they backported x32 support
      along with this code).
      
      Note that CONFIG_X86_X32_ABI gets enabled at build time and only if
      CONFIG_X86_X32 is enabled and ld can build x32 executables.
      
      Other uses of COMPAT_USE_64BIT_TIME seem fine.
      
      This addresses CVE-2014-0038.
      Signed-off-by: PaX Team <pageexec@freemail.hu>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v3.4+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
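The pattern that commit 2def2ef2 says other callers of __sys_recvmmsg follow (copy the timeout from userland into a kernel-side struct first, then copy the result back) can be modelled in plain C. This is only a sketch: userland memory is represented by a flat buffer, user pointers by offsets into it, and all `model_*` names are hypothetical stand-ins rather than kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <time.h>

/* Hypothetical model of a userland address space: one flat buffer. */
static unsigned char user_mem[64];

/* Stand-in for copy_from_user(): fails unless the range lies in user_mem. */
static int model_copy_from_user(void *dst, size_t uaddr, size_t n)
{
	if (uaddr > sizeof(user_mem) || n > sizeof(user_mem) - uaddr)
		return -EFAULT;
	memcpy(dst, user_mem + uaddr, n);
	return 0;
}

/* Stand-in for copy_to_user(), with the same range check. */
static int model_copy_to_user(size_t uaddr, const void *src, size_t n)
{
	if (uaddr > sizeof(user_mem) || n > sizeof(user_mem) - uaddr)
		return -EFAULT;
	memcpy(user_mem + uaddr, src, n);
	return 0;
}

/* Inner routine: only ever touches a kernel-side timespec. */
static int model_inner_recvmmsg(struct timespec *kts)
{
	assert(kts != NULL);
	if (kts->tv_sec > 0)
		kts->tv_sec -= 1;	/* pretend one second of the timeout elapsed */
	return 1;			/* pretend one datagram arrived */
}

/* Entry point: never hands the user "pointer" to the inner routine. */
static int model_recvmmsg_entry(size_t utimeout)
{
	struct timespec kts;
	int datagrams;

	if (model_copy_from_user(&kts, utimeout, sizeof(kts)))
		return -EFAULT;

	datagrams = model_inner_recvmmsg(&kts);

	if (datagrams > 0 && model_copy_to_user(utimeout, &kts, sizeof(kts)))
		datagrams = -EFAULT;

	return datagrams;
}
```

In this model an out-of-range timeout address yields -EFAULT and is never dereferenced. The cast in the broken x32 path bypassed exactly that check.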
  11. 30 Nov, 2013 1 commit
  12. 21 Nov, 2013 1 commit
    • H
      net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Hannes Frederic Sowa committed
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case recvfrom is called with NULL as address. We don't
      need to copy the address at all, so set it to NULL before invoking the
      recvmsg handler. We can do so, because all the recvmsg handlers must
      cope with the case a plain read() is called on them. read() also sets
      msg_name to NULL.
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy-in the
      address. It also more naturally reflects the logic by the callers of
      verify_iovec.
      
      With this change in place I could remove "
      if (!uaddr || msg_sys->msg_namelen == 0)
      	msg->msg_name = NULL
      ".
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
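The contract that commit f3d33426 places on recvmsg handlers (msg_namelen arrives as 0, and the handler raises it only when it actually fills msg_name) can be sketched as follows. `struct model_msghdr`, `model_handler_recvmsg()` and `model_core_recvmsg()` are hypothetical userland stand-ins, and the AF_INET address and port 4242 are arbitrary illustration values.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical kernel-side message header, reduced to the name fields. */
struct model_msghdr {
	void *msg_name;		/* buffer of sizeof(struct sockaddr_storage), or NULL */
	int   msg_namelen;	/* always 0 on entry to the handler */
};

/* Hypothetical protocol handler: reports a peer address only when asked. */
static void model_handler_recvmsg(struct model_msghdr *msg)
{
	assert(msg->msg_namelen == 0);	/* contract: the core passes 0 */

	if (msg->msg_name) {
		struct sockaddr_in *sin = msg->msg_name;

		memset(sin, 0, sizeof(*sin));	/* no uninitialised bytes leak */
		sin->sin_family = AF_INET;
		sin->sin_port = htons(4242);
		msg->msg_namelen = sizeof(*sin);
	}
}

/*
 * Core side: returns how many address bytes would be copied back to the
 * caller. With no address requested, msg_name is NULL and nothing is copied.
 */
static int model_core_recvmsg(int user_wants_addr, struct sockaddr_storage *kaddr)
{
	struct model_msghdr msg;

	msg.msg_name = user_wants_addr ? (void *)kaddr : NULL;
	msg.msg_namelen = 0;

	model_handler_recvmsg(&msg);
	return msg.msg_namelen;
}
```

Because the length starts at 0, a handler that forgets to set it returns no address bytes at all instead of leaking whatever was in the buffer.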
  13. 04 Oct, 2013 1 commit
  14. 07 Jun, 2013 1 commit
  15. 27 Sep, 2012 1 commit
  16. 23 Jul, 2012 1 commit
  17. 16 Apr, 2012 1 commit
  18. 14 Apr, 2012 1 commit
  19. 12 Mar, 2012 1 commit
  20. 21 Feb, 2012 1 commit
  21. 01 Nov, 2011 1 commit
  22. 06 May, 2011 1 commit
    • A
      net: Add sendmmsg socket system call · 228e548e
      Anton Blanchard committed
      This patch adds a multiple message send syscall and is the send
      version of the existing recvmmsg syscall. This is heavily
      based on the patch by Arnaldo that added recvmmsg.
      
      I wrote a microbenchmark to test the performance gains of using
      this new syscall:
      
      http://ozlabs.org/~anton/junkcode/sendmmsg_test.c
      
      The test was run on a ppc64 box with a 10 Gbit network card. The
      benchmark can send both UDP and RAW ethernet packets.
      
      64B UDP
      
      batch   pkts/sec
      1       804570
      2       872800 (+ 8 %)
      4       916556 (+14 %)
      8       939712 (+17 %)
      16      952688 (+18 %)
      32      956448 (+19 %)
      64      964800 (+20 %)
      
      64B raw socket
      
      batch   pkts/sec
      1       1201449
      2       1350028 (+12 %)
      4       1461416 (+22 %)
      8       1513080 (+26 %)
      16      1541216 (+28 %)
      32      1553440 (+29 %)
      64      1557888 (+30 %)
      
      We see a 20% improvement in throughput on UDP send and 30%
      on raw socket send.
      
      [ Add sparc syscall entries. -DaveM ]
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
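A minimal userland use of the sendmmsg() call introduced by commit 228e548e might look like the sketch below. It assumes Linux with glibc (hence `_GNU_SOURCE`) and sends over an AF_UNIX datagram socketpair so that no network setup is needed. The batch size of 4, the "ping" payload and the helper name `send_batch()` are illustrative choices, not taken from the benchmark linked in the commit.

```c
#define _GNU_SOURCE		/* for sendmmsg() and struct mmsghdr on glibc */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH 4

/*
 * Send BATCH copies of a small payload with a single sendmmsg() call.
 * Returns the number of messages sent, or -1 on error.
 */
static int send_batch(void)
{
	static char payload[] = "ping";
	struct mmsghdr msgs[BATCH];
	struct iovec iov[BATCH];
	int sv[2], sent, i;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < BATCH; i++) {
		iov[i].iov_base = payload;
		iov[i].iov_len = sizeof(payload) - 1;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/* One system call for the whole batch. */
	sent = sendmmsg(sv[0], msgs, BATCH, 0);
	assert(sent <= BATCH);

	close(sv[0]);
	close(sv[1]);
	return sent;
}
```

After a successful call, each `msgs[i].msg_len` holds the number of bytes sent for that message. Replacing several system calls with one is where the per-batch gains in the tables above come from.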
  23. 29 Oct, 2010 1 commit
    • D
      net: Limit socket I/O iovec total length to INT_MAX. · 8acfe468
      David S. Miller committed
      This helps protect us from overflow issues down in the
      individual protocol sendmsg/recvmsg handlers.  Once
      we hit INT_MAX we truncate out the rest of the iovec
      by setting the iov_len members to zero.
      
      This works because:
      
      1) For SOCK_STREAM and SOCK_SEQPACKET sockets, partial
         writes are allowed and the application will just continue
         with another write to send the rest of the data.
      
      2) For datagram oriented sockets, where there must be a
         one-to-one correspondence between write() calls and
         packets on the wire, INT_MAX is going to be far larger
         than the packet size limit the protocol is going to
         check for and signal with -EMSGSIZE.
      
      Based upon a patch by Linus Torvalds.
      Signed-off-by: David S. Miller <davem@davemloft.net>
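The truncation rule described in commit 8acfe468 (stop counting at INT_MAX and zero out whatever follows) can be sketched as a standalone function. `clamp_iovec_total()` is a hypothetical name and this is a userland model of the idea, not the kernel's verify_iovec().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Walk the iovec array, shrinking entries so that the running total never
 * exceeds INT_MAX. Entries past the limit end up with iov_len == 0.
 * Returns the (clamped) total length.
 */
static int clamp_iovec_total(struct iovec *iov, size_t count)
{
	int total = 0;
	size_t i;

	assert(iov != NULL || count == 0);

	for (i = 0; i < count; i++) {
		size_t len = iov[i].iov_len;

		if (len > (size_t)(INT_MAX - total)) {
			len = (size_t)(INT_MAX - total);
			iov[i].iov_len = len;	/* truncate this entry */
		}
		total += (int)len;
	}
	return total;
}
```

For segment lengths of INT_MAX - 10, 100 and 50, the second entry is cut to 10 and the third to 0, so the total is exactly INT_MAX and a protocol handler summing the lengths into an int cannot overflow.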
  24. 04 Jun, 2010 1 commit
    • E
      From abbffa2aa9bd6f8df16d0d0a102af677510d8b9a Mon Sep 17 00:00:00 2001 · c6d409cf
      Eric Dumazet committed
      From: Eric Dumazet <eric.dumazet@gmail.com>
      Date: Thu, 3 Jun 2010 04:29:41 +0000
      Subject: [PATCH 2/3] net: net/socket.c and net/compat.c cleanups
      
      cleanup patch, to match modern coding style.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ---
       net/compat.c |   47 ++++++++---------
       net/socket.c |  165 ++++++++++++++++++++++++++++------------------------------
       2 files changed, 102 insertions(+), 110 deletions(-)
      
      diff --git a/net/compat.c b/net/compat.c
      index 1cf7590..63d260e 100644
      --- a/net/compat.c
      +++ b/net/compat.c
      @@ -81,7 +81,7 @@ int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *kern_iov,
       	int tot_len;
      
       	if (kern_msg->msg_namelen) {
      -		if (mode==VERIFY_READ) {
      +		if (mode == VERIFY_READ) {
       			int err = move_addr_to_kernel(kern_msg->msg_name,
       						      kern_msg->msg_namelen,
       						      kern_address);
      @@ -354,7 +354,7 @@ static int do_set_attach_filter(struct socket *sock, int level, int optname,
       static int do_set_sock_timeout(struct socket *sock, int level,
       		int optname, char __user *optval, unsigned int optlen)
       {
      -	struct compat_timeval __user *up = (struct compat_timeval __user *) optval;
      +	struct compat_timeval __user *up = (struct compat_timeval __user *)optval;
       	struct timeval ktime;
       	mm_segment_t old_fs;
       	int err;
      @@ -367,7 +367,7 @@ static int do_set_sock_timeout(struct socket *sock, int level,
       		return -EFAULT;
       	old_fs = get_fs();
       	set_fs(KERNEL_DS);
      -	err = sock_setsockopt(sock, level, optname, (char *) &ktime, sizeof(ktime));
      +	err = sock_setsockopt(sock, level, optname, (char *)&ktime, sizeof(ktime));
       	set_fs(old_fs);
      
       	return err;
      @@ -389,11 +389,10 @@ asmlinkage long compat_sys_setsockopt(int fd, int level, int optname,
       				char __user *optval, unsigned int optlen)
       {
       	int err;
      -	struct socket *sock;
      +	struct socket *sock = sockfd_lookup(fd, &err);
      
      -	if ((sock = sockfd_lookup(fd, &err))!=NULL)
      -	{
      -		err = security_socket_setsockopt(sock,level,optname);
      +	if (sock) {
      +		err = security_socket_setsockopt(sock, level, optname);
       		if (err) {
       			sockfd_put(sock);
       			return err;
      @@ -453,7 +452,7 @@ static int compat_sock_getsockopt(struct socket *sock, int level, int optname,
       int compat_sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp)
       {
       	struct compat_timeval __user *ctv =
      -			(struct compat_timeval __user*) userstamp;
      +			(struct compat_timeval __user *) userstamp;
       	int err = -ENOENT;
       	struct timeval tv;
      
      @@ -477,7 +476,7 @@ EXPORT_SYMBOL(compat_sock_get_timestamp);
       int compat_sock_get_timestampns(struct sock *sk, struct timespec __user *userstamp)
       {
       	struct compat_timespec __user *ctv =
      -			(struct compat_timespec __user*) userstamp;
      +			(struct compat_timespec __user *) userstamp;
       	int err = -ENOENT;
       	struct timespec ts;
      
      @@ -502,12 +501,10 @@ asmlinkage long compat_sys_getsockopt(int fd, int level, int optname,
       				char __user *optval, int __user *optlen)
       {
       	int err;
      -	struct socket *sock;
      +	struct socket *sock = sockfd_lookup(fd, &err);
      
      -	if ((sock = sockfd_lookup(fd, &err))!=NULL)
      -	{
      -		err = security_socket_getsockopt(sock, level,
      -							   optname);
      +	if (sock) {
      +		err = security_socket_getsockopt(sock, level, optname);
       		if (err) {
       			sockfd_put(sock);
       			return err;
      @@ -557,7 +554,7 @@ struct compat_group_filter {
      
       int compat_mc_setsockopt(struct sock *sock, int level, int optname,
       	char __user *optval, unsigned int optlen,
      -	int (*setsockopt)(struct sock *,int,int,char __user *,unsigned int))
      +	int (*setsockopt)(struct sock *, int, int, char __user *, unsigned int))
       {
       	char __user	*koptval = optval;
       	int		koptlen = optlen;
      @@ -640,12 +637,11 @@ int compat_mc_setsockopt(struct sock *sock, int level, int optname,
       	}
       	return setsockopt(sock, level, optname, koptval, koptlen);
       }
      -
       EXPORT_SYMBOL(compat_mc_setsockopt);
      
       int compat_mc_getsockopt(struct sock *sock, int level, int optname,
       	char __user *optval, int __user *optlen,
      -	int (*getsockopt)(struct sock *,int,int,char __user *,int __user *))
      +	int (*getsockopt)(struct sock *, int, int, char __user *, int __user *))
       {
       	struct compat_group_filter __user *gf32 = (void *)optval;
       	struct group_filter __user *kgf;
      @@ -681,7 +677,7 @@ int compat_mc_getsockopt(struct sock *sock, int level, int optname,
       	    __put_user(interface, &kgf->gf_interface) ||
       	    __put_user(fmode, &kgf->gf_fmode) ||
       	    __put_user(numsrc, &kgf->gf_numsrc) ||
      -	    copy_in_user(&kgf->gf_group,&gf32->gf_group,sizeof(kgf->gf_group)))
      +	    copy_in_user(&kgf->gf_group, &gf32->gf_group, sizeof(kgf->gf_group)))
       		return -EFAULT;
      
       	err = getsockopt(sock, level, optname, (char __user *)kgf, koptlen);
      @@ -714,21 +710,22 @@ int compat_mc_getsockopt(struct sock *sock, int level, int optname,
       		copylen = numsrc * sizeof(gf32->gf_slist[0]);
       		if (copylen > klen)
       			copylen = klen;
      -	        if (copy_in_user(gf32->gf_slist, kgf->gf_slist, copylen))
      +		if (copy_in_user(gf32->gf_slist, kgf->gf_slist, copylen))
       			return -EFAULT;
       	}
       	return err;
       }
      -
       EXPORT_SYMBOL(compat_mc_getsockopt);
      
       /* Argument list sizes for compat_sys_socketcall */
       #define AL(x) ((x) * sizeof(u32))
      -static unsigned char nas[20]={AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
      -				AL(3),AL(3),AL(4),AL(4),AL(4),AL(6),
      -				AL(6),AL(2),AL(5),AL(5),AL(3),AL(3),
      -				AL(4),AL(5)};
      +static unsigned char nas[20] = {
      +	AL(0), AL(3), AL(3), AL(3), AL(2), AL(3),
      +	AL(3), AL(3), AL(4), AL(4), AL(4), AL(6),
      +	AL(6), AL(2), AL(5), AL(5), AL(3), AL(3),
      +	AL(4), AL(5)
      +};
       #undef AL
      
       asmlinkage long compat_sys_sendmsg(int fd, struct compat_msghdr __user *msg, unsigned flags)
      @@ -827,7 +824,7 @@ asmlinkage long compat_sys_socketcall(int call, u32 __user *args)
       					  compat_ptr(a[4]), compat_ptr(a[5]));
       		break;
       	case SYS_SHUTDOWN:
      -		ret = sys_shutdown(a0,a1);
      +		ret = sys_shutdown(a0, a1);
       		break;
       	case SYS_SETSOCKOPT:
       		ret = compat_sys_setsockopt(a0, a1, a[2],
      diff --git a/net/socket.c b/net/socket.c
      index 367d547..b63c051 100644
      --- a/net/socket.c
      +++ b/net/socket.c
      @@ -124,7 +124,7 @@ static int sock_fasync(int fd, struct file *filp, int on);
       static ssize_t sock_sendpage(struct file *file, struct page *page,
       			     int offset, size_t size, loff_t *ppos, int more);
       static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
      -			        struct pipe_inode_info *pipe, size_t len,
      +				struct pipe_inode_info *pipe, size_t len,
       				unsigned int flags);
      
       /*
      @@ -162,7 +162,7 @@ static const struct net_proto_family *net_families[NPROTO] __read_mostly;
        *	Statistics counters of the socket lists
        */
      
      -static DEFINE_PER_CPU(int, sockets_in_use) = 0;
      +static DEFINE_PER_CPU(int, sockets_in_use);
      
       /*
        * Support routines.
      @@ -309,9 +309,9 @@ static int init_inodecache(void)
       }
      
       static const struct super_operations sockfs_ops = {
      -	.alloc_inode =	sock_alloc_inode,
      -	.destroy_inode =sock_destroy_inode,
      -	.statfs =	simple_statfs,
      +	.alloc_inode	= sock_alloc_inode,
      +	.destroy_inode	= sock_destroy_inode,
      +	.statfs		= simple_statfs,
       };
      
       static int sockfs_get_sb(struct file_system_type *fs_type,
      @@ -411,6 +411,7 @@ int sock_map_fd(struct socket *sock, int flags)
      
       	return fd;
       }
      +EXPORT_SYMBOL(sock_map_fd);
      
       static struct socket *sock_from_file(struct file *file, int *err)
       {
      @@ -422,7 +423,7 @@ static struct socket *sock_from_file(struct file *file, int *err)
       }
      
       /**
      - *	sockfd_lookup	- 	Go from a file number to its socket slot
      + *	sockfd_lookup - Go from a file number to its socket slot
        *	@fd: file handle
        *	@err: pointer to an error code return
        *
      @@ -450,6 +451,7 @@ struct socket *sockfd_lookup(int fd, int *err)
       		fput(file);
       	return sock;
       }
      +EXPORT_SYMBOL(sockfd_lookup);
      
       static struct socket *sockfd_lookup_light(int fd, int *err, int *fput_needed)
       {
      @@ -540,6 +542,7 @@ void sock_release(struct socket *sock)
       	}
       	sock->file = NULL;
       }
      +EXPORT_SYMBOL(sock_release);
      
       int sock_tx_timestamp(struct msghdr *msg, struct sock *sk,
       		      union skb_shared_tx *shtx)
      @@ -586,6 +589,7 @@ int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
       		ret = wait_on_sync_kiocb(&iocb);
       	return ret;
       }
      +EXPORT_SYMBOL(sock_sendmsg);
      
       int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
       		   struct kvec *vec, size_t num, size_t size)
      @@ -604,6 +608,7 @@ int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
       	set_fs(oldfs);
       	return result;
       }
      +EXPORT_SYMBOL(kernel_sendmsg);
      
       static int ktime2ts(ktime_t kt, struct timespec *ts)
       {
      @@ -664,7 +669,6 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
       		put_cmsg(msg, SOL_SOCKET,
       			 SCM_TIMESTAMPING, sizeof(ts), &ts);
       }
      -
       EXPORT_SYMBOL_GPL(__sock_recv_timestamp);
      
       inline void sock_recv_drops(struct msghdr *msg, struct sock *sk, struct sk_buff *skb)
      @@ -720,6 +724,7 @@ int sock_recvmsg(struct socket *sock, struct msghdr *msg,
       		ret = wait_on_sync_kiocb(&iocb);
       	return ret;
       }
      +EXPORT_SYMBOL(sock_recvmsg);
      
       static int sock_recvmsg_nosec(struct socket *sock, struct msghdr *msg,
       			      size_t size, int flags)
      @@ -752,6 +757,7 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
       	set_fs(oldfs);
       	return result;
       }
      +EXPORT_SYMBOL(kernel_recvmsg);
      
       static void sock_aio_dtor(struct kiocb *iocb)
       {
      @@ -774,7 +780,7 @@ static ssize_t sock_sendpage(struct file *file, struct page *page,
       }
      
       static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
      -			        struct pipe_inode_info *pipe, size_t len,
      +				struct pipe_inode_info *pipe, size_t len,
       				unsigned int flags)
       {
       	struct socket *sock = file->private_data;
      @@ -887,7 +893,7 @@ static ssize_t sock_aio_write(struct kiocb *iocb, const struct iovec *iov,
        */
      
       static DEFINE_MUTEX(br_ioctl_mutex);
      -static int (*br_ioctl_hook) (struct net *, unsigned int cmd, void __user *arg) = NULL;
      +static int (*br_ioctl_hook) (struct net *, unsigned int cmd, void __user *arg);
      
       void brioctl_set(int (*hook) (struct net *, unsigned int, void __user *))
       {
      @@ -895,7 +901,6 @@ void brioctl_set(int (*hook) (struct net *, unsigned int, void __user *))
       	br_ioctl_hook = hook;
       	mutex_unlock(&br_ioctl_mutex);
       }
      -
       EXPORT_SYMBOL(brioctl_set);
      
       static DEFINE_MUTEX(vlan_ioctl_mutex);
      @@ -907,7 +912,6 @@ void vlan_ioctl_set(int (*hook) (struct net *, void __user *))
       	vlan_ioctl_hook = hook;
       	mutex_unlock(&vlan_ioctl_mutex);
       }
      -
       EXPORT_SYMBOL(vlan_ioctl_set);
      
       static DEFINE_MUTEX(dlci_ioctl_mutex);
      @@ -919,7 +923,6 @@ void dlci_ioctl_set(int (*hook) (unsigned int, void __user *))
       	dlci_ioctl_hook = hook;
       	mutex_unlock(&dlci_ioctl_mutex);
       }
      -
       EXPORT_SYMBOL(dlci_ioctl_set);
      
       static long sock_do_ioctl(struct net *net, struct socket *sock,
      @@ -1047,6 +1050,7 @@ out_release:
       	sock = NULL;
       	goto out;
       }
      +EXPORT_SYMBOL(sock_create_lite);
      
       /* No kernel lock held - perfect */
       static unsigned int sock_poll(struct file *file, poll_table *wait)
      @@ -1147,6 +1151,7 @@ call_kill:
       	rcu_read_unlock();
       	return 0;
       }
      +EXPORT_SYMBOL(sock_wake_async);
      
       static int __sock_create(struct net *net, int family, int type, int protocol,
       			 struct socket **res, int kern)
      @@ -1265,11 +1270,13 @@ int sock_create(int family, int type, int protocol, struct socket **res)
       {
       	return __sock_create(current->nsproxy->net_ns, family, type, protocol, res, 0);
       }
      +EXPORT_SYMBOL(sock_create);
      
       int sock_create_kern(int family, int type, int protocol, struct socket **res)
       {
       	return __sock_create(&init_net, family, type, protocol, res, 1);
       }
      +EXPORT_SYMBOL(sock_create_kern);
      
       SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol)
       {
      @@ -1474,7 +1481,8 @@ SYSCALL_DEFINE4(accept4, int, fd, struct sockaddr __user *, upeer_sockaddr,
       		goto out;
      
       	err = -ENFILE;
      -	if (!(newsock = sock_alloc()))
      +	newsock = sock_alloc();
      +	if (!newsock)
       		goto out_put;
      
       	newsock->type = sock->type;
      @@ -1861,8 +1869,7 @@ SYSCALL_DEFINE3(sendmsg, int, fd, struct msghdr __user *, msg, unsigned, flags)
       	if (MSG_CMSG_COMPAT & flags) {
       		if (get_compat_msghdr(&msg_sys, msg_compat))
       			return -EFAULT;
      -	}
      -	else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
      +	} else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
       		return -EFAULT;
      
       	sock = sockfd_lookup_light(fd, &err, &fput_needed);
      @@ -1964,8 +1971,7 @@ static int __sys_recvmsg(struct socket *sock, struct msghdr __user *msg,
       	if (MSG_CMSG_COMPAT & flags) {
       		if (get_compat_msghdr(msg_sys, msg_compat))
       			return -EFAULT;
      -	}
      -	else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
      +	} else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
       		return -EFAULT;
      
       	err = -EMSGSIZE;
      @@ -2191,10 +2197,10 @@ SYSCALL_DEFINE5(recvmmsg, int, fd, struct mmsghdr __user *, mmsg,
       /* Argument list sizes for sys_socketcall */
       #define AL(x) ((x) * sizeof(unsigned long))
       static const unsigned char nargs[20] = {
      -	AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
      -	AL(3),AL(3),AL(4),AL(4),AL(4),AL(6),
      -	AL(6),AL(2),AL(5),AL(5),AL(3),AL(3),
      -	AL(4),AL(5)
      +	AL(0), AL(3), AL(3), AL(3), AL(2), AL(3),
      +	AL(3), AL(3), AL(4), AL(4), AL(4), AL(6),
      +	AL(6), AL(2), AL(5), AL(5), AL(3), AL(3),
      +	AL(4), AL(5)
       };
      
       #undef AL
      @@ -2340,6 +2346,7 @@ int sock_register(const struct net_proto_family *ops)
       	printk(KERN_INFO "NET: Registered protocol family %d\n", ops->family);
       	return err;
       }
      +EXPORT_SYMBOL(sock_register);
      
       /**
        *	sock_unregister - remove a protocol handler
      @@ -2366,6 +2373,7 @@ void sock_unregister(int family)
      
       	printk(KERN_INFO "NET: Unregistered protocol family %d\n", family);
       }
      +EXPORT_SYMBOL(sock_unregister);
      
       static int __init sock_init(void)
       {
      @@ -2490,13 +2498,13 @@ static int dev_ifconf(struct net *net, struct compat_ifconf __user *uifc32)
       		ifc.ifc_req = NULL;
       		uifc = compat_alloc_user_space(sizeof(struct ifconf));
       	} else {
      -		size_t len =((ifc32.ifc_len / sizeof (struct compat_ifreq)) + 1) *
      -			sizeof (struct ifreq);
      +		size_t len = ((ifc32.ifc_len / sizeof(struct compat_ifreq)) + 1) *
      +			sizeof(struct ifreq);
       		uifc = compat_alloc_user_space(sizeof(struct ifconf) + len);
       		ifc.ifc_len = len;
       		ifr = ifc.ifc_req = (void __user *)(uifc + 1);
       		ifr32 = compat_ptr(ifc32.ifcbuf);
      -		for (i = 0; i < ifc32.ifc_len; i += sizeof (struct compat_ifreq)) {
      +		for (i = 0; i < ifc32.ifc_len; i += sizeof(struct compat_ifreq)) {
       			if (copy_in_user(ifr, ifr32, sizeof(struct compat_ifreq)))
       				return -EFAULT;
       			ifr++;
      @@ -2516,9 +2524,9 @@ static int dev_ifconf(struct net *net, struct compat_ifconf __user *uifc32)
       	ifr = ifc.ifc_req;
       	ifr32 = compat_ptr(ifc32.ifcbuf);
       	for (i = 0, j = 0;
      -             i + sizeof (struct compat_ifreq) <= ifc32.ifc_len && j < ifc.ifc_len;
      -	     i += sizeof (struct compat_ifreq), j += sizeof (struct ifreq)) {
      -		if (copy_in_user(ifr32, ifr, sizeof (struct compat_ifreq)))
      +	     i + sizeof(struct compat_ifreq) <= ifc32.ifc_len && j < ifc.ifc_len;
      +	     i += sizeof(struct compat_ifreq), j += sizeof(struct ifreq)) {
      +		if (copy_in_user(ifr32, ifr, sizeof(struct compat_ifreq)))
       			return -EFAULT;
       		ifr32++;
       		ifr++;
      @@ -2567,7 +2575,7 @@ static int compat_siocwandev(struct net *net, struct compat_ifreq __user *uifr32
       	compat_uptr_t uptr32;
       	struct ifreq __user *uifr;
      
      -	uifr = compat_alloc_user_space(sizeof (*uifr));
      +	uifr = compat_alloc_user_space(sizeof(*uifr));
       	if (copy_in_user(uifr, uifr32, sizeof(struct compat_ifreq)))
       		return -EFAULT;
      
      @@ -2601,9 +2609,9 @@ static int bond_ioctl(struct net *net, unsigned int cmd,
       			return -EFAULT;
      
       		old_fs = get_fs();
      -		set_fs (KERNEL_DS);
      +		set_fs(KERNEL_DS);
       		err = dev_ioctl(net, cmd, &kifr);
      -		set_fs (old_fs);
      +		set_fs(old_fs);
      
       		return err;
       	case SIOCBONDSLAVEINFOQUERY:
      @@ -2710,9 +2718,9 @@ static int compat_sioc_ifmap(struct net *net, unsigned int cmd,
       		return -EFAULT;
      
       	old_fs = get_fs();
      -	set_fs (KERNEL_DS);
      +	set_fs(KERNEL_DS);
       	err = dev_ioctl(net, cmd, (void __user *)&ifr);
      -	set_fs (old_fs);
      +	set_fs(old_fs);
      
       	if (cmd == SIOCGIFMAP && !err) {
       		err = copy_to_user(uifr32, &ifr, sizeof(ifr.ifr_name));
      @@ -2734,7 +2742,7 @@ static int compat_siocshwtstamp(struct net *net, struct compat_ifreq __user *uif
       	compat_uptr_t uptr32;
       	struct ifreq __user *uifr;
      
      -	uifr = compat_alloc_user_space(sizeof (*uifr));
      +	uifr = compat_alloc_user_space(sizeof(*uifr));
       	if (copy_in_user(uifr, uifr32, sizeof(struct compat_ifreq)))
       		return -EFAULT;
      
      @@ -2750,20 +2758,20 @@ static int compat_siocshwtstamp(struct net *net, struct compat_ifreq __user *uif
       }
      
       struct rtentry32 {
      -	u32   		rt_pad1;
      +	u32		rt_pad1;
       	struct sockaddr rt_dst;         /* target address               */
       	struct sockaddr rt_gateway;     /* gateway addr (RTF_GATEWAY)   */
       	struct sockaddr rt_genmask;     /* target network mask (IP)     */
      -	unsigned short  rt_flags;
      -	short           rt_pad2;
      -	u32   		rt_pad3;
      -	unsigned char   rt_tos;
      -	unsigned char   rt_class;
      -	short           rt_pad4;
      -	short           rt_metric;      /* +1 for binary compatibility! */
      +	unsigned short	rt_flags;
      +	short		rt_pad2;
      +	u32		rt_pad3;
      +	unsigned char	rt_tos;
      +	unsigned char	rt_class;
      +	short		rt_pad4;
      +	short		rt_metric;      /* +1 for binary compatibility! */
       	/* char * */ u32 rt_dev;        /* forcing the device at add    */
      -	u32   		rt_mtu;         /* per route MTU/Window         */
      -	u32   		rt_window;      /* Window clamping              */
      +	u32		rt_mtu;         /* per route MTU/Window         */
      +	u32		rt_window;      /* Window clamping              */
       	unsigned short  rt_irtt;        /* Initial RTT                  */
       };
      
      @@ -2793,29 +2801,29 @@ static int routing_ioctl(struct net *net, struct socket *sock,
      
       	if (sock && sock->sk && sock->sk->sk_family == AF_INET6) { /* ipv6 */
       		struct in6_rtmsg32 __user *ur6 = argp;
      -		ret = copy_from_user (&r6.rtmsg_dst, &(ur6->rtmsg_dst),
      +		ret = copy_from_user(&r6.rtmsg_dst, &(ur6->rtmsg_dst),
       			3 * sizeof(struct in6_addr));
      -		ret |= __get_user (r6.rtmsg_type, &(ur6->rtmsg_type));
      -		ret |= __get_user (r6.rtmsg_dst_len, &(ur6->rtmsg_dst_len));
      -		ret |= __get_user (r6.rtmsg_src_len, &(ur6->rtmsg_src_len));
      -		ret |= __get_user (r6.rtmsg_metric, &(ur6->rtmsg_metric));
      -		ret |= __get_user (r6.rtmsg_info, &(ur6->rtmsg_info));
      -		ret |= __get_user (r6.rtmsg_flags, &(ur6->rtmsg_flags));
      -		ret |= __get_user (r6.rtmsg_ifindex, &(ur6->rtmsg_ifindex));
      +		ret |= __get_user(r6.rtmsg_type, &(ur6->rtmsg_type));
      +		ret |= __get_user(r6.rtmsg_dst_len, &(ur6->rtmsg_dst_len));
      +		ret |= __get_user(r6.rtmsg_src_len, &(ur6->rtmsg_src_len));
      +		ret |= __get_user(r6.rtmsg_metric, &(ur6->rtmsg_metric));
      +		ret |= __get_user(r6.rtmsg_info, &(ur6->rtmsg_info));
      +		ret |= __get_user(r6.rtmsg_flags, &(ur6->rtmsg_flags));
      +		ret |= __get_user(r6.rtmsg_ifindex, &(ur6->rtmsg_ifindex));
      
       		r = (void *) &r6;
       	} else { /* ipv4 */
       		struct rtentry32 __user *ur4 = argp;
      -		ret = copy_from_user (&r4.rt_dst, &(ur4->rt_dst),
      +		ret = copy_from_user(&r4.rt_dst, &(ur4->rt_dst),
       					3 * sizeof(struct sockaddr));
      -		ret |= __get_user (r4.rt_flags, &(ur4->rt_flags));
      -		ret |= __get_user (r4.rt_metric, &(ur4->rt_metric));
      -		ret |= __get_user (r4.rt_mtu, &(ur4->rt_mtu));
      -		ret |= __get_user (r4.rt_window, &(ur4->rt_window));
      -		ret |= __get_user (r4.rt_irtt, &(ur4->rt_irtt));
      -		ret |= __get_user (rtdev, &(ur4->rt_dev));
      +		ret |= __get_user(r4.rt_flags, &(ur4->rt_flags));
      +		ret |= __get_user(r4.rt_metric, &(ur4->rt_metric));
      +		ret |= __get_user(r4.rt_mtu, &(ur4->rt_mtu));
      +		ret |= __get_user(r4.rt_window, &(ur4->rt_window));
      +		ret |= __get_user(r4.rt_irtt, &(ur4->rt_irtt));
      +		ret |= __get_user(rtdev, &(ur4->rt_dev));
       		if (rtdev) {
      -			ret |= copy_from_user (devname, compat_ptr(rtdev), 15);
      +			ret |= copy_from_user(devname, compat_ptr(rtdev), 15);
       			r4.rt_dev = devname; devname[15] = 0;
       		} else
       			r4.rt_dev = NULL;
      @@ -2828,9 +2836,9 @@ static int routing_ioctl(struct net *net, struct socket *sock,
       		goto out;
       	}
      
      -	set_fs (KERNEL_DS);
      +	set_fs(KERNEL_DS);
       	ret = sock_do_ioctl(net, sock, cmd, (unsigned long) r);
      -	set_fs (old_fs);
      +	set_fs(old_fs);
      
       out:
       	return ret;
      @@ -2993,11 +3001,13 @@ int kernel_bind(struct socket *sock, struct sockaddr *addr, int addrlen)
       {
       	return sock->ops->bind(sock, addr, addrlen);
       }
      +EXPORT_SYMBOL(kernel_bind);
      
       int kernel_listen(struct socket *sock, int backlog)
       {
       	return sock->ops->listen(sock, backlog);
       }
      +EXPORT_SYMBOL(kernel_listen);
      
       int kernel_accept(struct socket *sock, struct socket **newsock, int flags)
       {
      @@ -3022,24 +3032,28 @@ int kernel_accept(struct socket *sock, struct socket **newsock, int flags)
       done:
       	return err;
       }
      +EXPORT_SYMBOL(kernel_accept);
      
       int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
       		   int flags)
       {
       	return sock->ops->connect(sock, addr, addrlen, flags);
       }
      +EXPORT_SYMBOL(kernel_connect);
      
       int kernel_getsockname(struct socket *sock, struct sockaddr *addr,
       			 int *addrlen)
       {
       	return sock->ops->getname(sock, addr, addrlen, 0);
       }
      +EXPORT_SYMBOL(kernel_getsockname);
      
       int kernel_getpeername(struct socket *sock, struct sockaddr *addr,
       			 int *addrlen)
       {
       	return sock->ops->getname(sock, addr, addrlen, 1);
       }
      +EXPORT_SYMBOL(kernel_getpeername);
      
       int kernel_getsockopt(struct socket *sock, int level, int optname,
       			char *optval, int *optlen)
      @@ -3056,6 +3070,7 @@ int kernel_getsockopt(struct socket *sock, int level, int optname,
       	set_fs(oldfs);
       	return err;
       }
      +EXPORT_SYMBOL(kernel_getsockopt);
      
       int kernel_setsockopt(struct socket *sock, int level, int optname,
       			char *optval, unsigned int optlen)
      @@ -3072,6 +3087,7 @@ int kernel_setsockopt(struct socket *sock, int level, int optname,
       	set_fs(oldfs);
       	return err;
       }
      +EXPORT_SYMBOL(kernel_setsockopt);
      
       int kernel_sendpage(struct socket *sock, struct page *page, int offset,
       		    size_t size, int flags)
      @@ -3083,6 +3099,7 @@ int kernel_sendpage(struct socket *sock, struct page *page, int offset,
      
       	return sock_no_sendpage(sock, page, offset, size, flags);
       }
      +EXPORT_SYMBOL(kernel_sendpage);
      
       int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg)
       {
      @@ -3095,33 +3112,11 @@ int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg)
      
       	return err;
       }
      +EXPORT_SYMBOL(kernel_sock_ioctl);
      
       int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
       {
       	return sock->ops->shutdown(sock, how);
       }
      -
      -EXPORT_SYMBOL(sock_create);
      -EXPORT_SYMBOL(sock_create_kern);
      -EXPORT_SYMBOL(sock_create_lite);
      -EXPORT_SYMBOL(sock_map_fd);
      -EXPORT_SYMBOL(sock_recvmsg);
      -EXPORT_SYMBOL(sock_register);
      -EXPORT_SYMBOL(sock_release);
      -EXPORT_SYMBOL(sock_sendmsg);
      -EXPORT_SYMBOL(sock_unregister);
      -EXPORT_SYMBOL(sock_wake_async);
      -EXPORT_SYMBOL(sockfd_lookup);
      -EXPORT_SYMBOL(kernel_sendmsg);
      -EXPORT_SYMBOL(kernel_recvmsg);
      -EXPORT_SYMBOL(kernel_bind);
      -EXPORT_SYMBOL(kernel_listen);
      -EXPORT_SYMBOL(kernel_accept);
      -EXPORT_SYMBOL(kernel_connect);
      -EXPORT_SYMBOL(kernel_getsockname);
      -EXPORT_SYMBOL(kernel_getpeername);
      -EXPORT_SYMBOL(kernel_getsockopt);
      -EXPORT_SYMBOL(kernel_setsockopt);
      -EXPORT_SYMBOL(kernel_sendpage);
      -EXPORT_SYMBOL(kernel_sock_ioctl);
       EXPORT_SYMBOL(kernel_sock_shutdown);
      +
      --
      1.7.0.4
      c6d409cf
  25. 03 Jun, 2010 1 commit
  26. 30 Mar, 2010 1 commit
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the
      following script is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  27. 12 Dec, 2009 2 commits
  28. 02 Dec, 2009 1 commit
  29. 29 Oct, 2009 1 commit
  30. 13 Oct, 2009 1 commit
    • A
      net: Introduce recvmmsg socket syscall · a2e27255
      Arnaldo Carvalho de Melo committed
      recvmmsg receives multiple messages in one call, reducing the number
      of syscalls and net stack entry/exit operations.
      
      Next patches will introduce mechanisms where protocols that want to
      optimize this operation will provide an unlocked_recvmsg operation.
      
      This takes into account comments made by:
      
      . Paul Moore: sock_recvmsg is called only for the first datagram,
        sock_recvmsg_nosec is used for the rest.
      
      . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
        works in the same fashion as the ppoll one.
      
        If the underlying protocol returns a datagram with MSG_OOB set, this
        will make recvmmsg return right away with as many datagrams (+ the OOB
        one) it has received so far.
      
      . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
        datagrams and then recvmsg returns an error, recvmmsg will return
        the successfully received datagrams, store the error and return it
        in the next call.
      
      This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
      where we will be able to acquire the lock only at batch start and end, not at
      every underlying recvmsg call.
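
As a rough userspace illustration of the batching described above, the sketch below queues a few datagrams on an AF_UNIX socket pair and drains them with a single recvmmsg() call. The helper name recv_batch, the BATCH and BUFSZ limits, and the one-byte payloads are assumptions made for this example; it relies on the glibc recvmmsg() wrapper rather than anything in this patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH 8   /* illustrative cap on datagrams fetched per call */
#define BUFSZ 64  /* illustrative per-datagram buffer size */

/*
 * Hypothetical helper: queue `count` one-byte datagrams on a socket pair,
 * then fetch them with one recvmmsg() call.  The first byte of each
 * received datagram is stored in first_bytes[].  Returns the number of
 * datagrams received, or -1 on error.
 */
static int recv_batch(int count, char *first_bytes)
{
	struct mmsghdr msgs[BATCH];
	struct iovec iov[BATCH];
	char bufs[BATCH][BUFSZ];
	int sv[2], i, n;

	assert(count >= 1 && count <= BATCH);

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
		return -1;

	for (i = 0; i < count; i++) {
		char c = (char)('a' + i);

		if (send(sv[0], &c, 1, 0) != 1) {
			n = -1;
			goto out;
		}
	}

	/* One mmsghdr per slot, each pointing at its own buffer. */
	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < BATCH; i++) {
		iov[i].iov_base = bufs[i];
		iov[i].iov_len = BUFSZ;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/* MSG_DONTWAIT: return what is queued instead of waiting for BATCH. */
	n = recvmmsg(sv[1], msgs, BATCH, MSG_DONTWAIT, NULL);

	for (i = 0; i < n; i++)
		first_bytes[i] = msgs[i].msg_len ? bufs[i][0] : '\0';
out:
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

Calling recv_batch(3, out) performs three send() calls but only one receive syscall; each msgs[i].msg_len reports the size of the corresponding datagram.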
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2e27255
  31. 01 Oct, 2009 1 commit
  32. 15 Jul, 2009 1 commit
    • J
      net/compat/wext: send different messages to compat tasks · 1dacc76d
      Johannes Berg committed
      Wireless extensions have the unfortunate problem that events
      are multicast netlink messages, and are not independent of
      pointer size. Thus, currently 32-bit tasks on 64-bit platforms
      cannot properly receive events and fail with all kinds of
      strange problems. For instance, wpa_supplicant never notices
      disassociations: because of the way the 64-bit event looks to a
      32-bit process, the fact that the address is all zeroes is
      lost, and it thinks the address is 00:00:00:00:01:00 instead.
      
      The same problem existed with the ioctls, until David Miller
      fixed those some time ago in a heroic effort.
      
      A different problem caused by this is that we cannot send the
      ASSOCREQIE/ASSOCRESPIE events because sending them causes a
      32-bit wpa_supplicant on a 64-bit system to overwrite its
      internal information, which is worse than it not getting the
      information at all -- so we currently resort to sending a
      custom string event that it then parses. This, however, has a
      severe size limitation we are frequently hitting with modern
      access points; this limitation can be lifted after this
      patch by sending the correct binary, not custom, event.
      
      A similar problem apparently happens for some other netlink
      users on x86_64 with 32-bit tasks due to the alignment for
      64-bit quantities.
      
      In order to fix these problems, I have implemented a way to
      send compat messages to tasks. When sending an event, we send
      the non-compat event data together with a compat event data in
      skb_shinfo(main_skb)->frag_list. Then, when the event is read
      from the socket, the netlink code makes sure to pass out only
      the skb that is compatible with the task. This approach was
      suggested by David Miller; my original approach required
      always sending two skbs but that had various small problems.
      
      To determine whether compat is needed or not, I have used the
      MSG_CMSG_COMPAT flag, and adjusted the call path for recv and
      recvfrom to include it, even if those calls do not have a cmsg
      parameter.
      
      I have not solved one small part of the problem, and I don't
      think it is necessary to: if a 32-bit application uses read()
      rather than any form of recvmsg() it will still get the wrong
      (64-bit) event. However, applications do not actually do
      this, nor would it be a regression.
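
The pointer-size dependence described above can be seen in a small userspace sketch. The two structs below are illustrative stand-ins, not the kernel's wireless-extension definitions: they model a pointer/length/flags record as a 64-bit task and as a 32-bit task would lay it out, using fixed-width integers in place of real pointers. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical native (64-bit) layout: 8-byte pointer field. */
struct ev_native {
	uint64_t pointer;
	uint16_t length;
	uint16_t flags;
};

/* Hypothetical compat (32-bit) layout: 4-byte pointer field. */
struct ev_compat {
	uint32_t pointer;
	uint16_t length;
	uint16_t flags;
};

/*
 * Read the length field the way a 32-bit reader would if it were handed
 * the raw bytes of the native record: it looks at offset 4, which in the
 * native layout is still inside the pointer field.
 */
static uint16_t length_seen_by_compat(const struct ev_native *ev)
{
	struct ev_compat view;

	memcpy(&view, ev, sizeof(view));
	return view.length;
}

/* Convert field by field, which is what a separate compat copy needs. */
static struct ev_compat to_compat(const struct ev_native *ev)
{
	struct ev_compat out;

	/* A compat task can only address 32 bits. */
	assert(ev->pointer <= UINT32_MAX);

	out.pointer = (uint32_t)ev->pointer;
	out.length = ev->length;
	out.flags = ev->flags;
	return out;
}
```

With a zero pointer and a length of 6, the misread length comes out as 0 while the converted record keeps 6, which is the kind of silent field shift the commit message attributes to 32-bit tasks reading 64-bit events.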
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1dacc76d
  33. 16 Feb, 2009 1 commit
  34. 20 Nov, 2008 1 commit
    • U
      reintroduce accept4 · de11defe
      Ulrich Drepper committed
      Introduce a new accept4() system call.  The addition of this system call
      matches analogous changes in 2.6.27 (dup3(), eventfd2(), signalfd4(),
      inotify_init1(), epoll_create1(), pipe2()) which added new system calls
      that differed from analogous traditional system calls in adding a flags
      argument that can be used to access additional functionality.
      
      The accept4() system call is exactly the same as accept(), except that
      it adds a flags bit-mask argument.  Two flags are initially implemented.
      (Most of the new system calls in 2.6.27 also had both of these flags.)
      
      SOCK_CLOEXEC causes the close-on-exec (FD_CLOEXEC) flag to be enabled
      for the new file descriptor returned by accept4().  This is a useful
      security feature to avoid leaking information in a multithreaded
      program where one thread is doing an accept() at the same time as
      another thread is doing a fork() plus exec().  More details are in
      "Secure File Descriptor Handling" by Ulrich Drepper:
      http://udrepper.livejournal.com/20407.html
      
      The other flag is SOCK_NONBLOCK, which causes the O_NONBLOCK flag
      to be enabled on the new open file description created by accept4().
      (This flag is merely a convenience, saving the use of additional calls
      to fcntl(F_GETFL) and fcntl(F_SETFL) to achieve the same result.)
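
For comparison, the fcntl() sequence that the flags argument replaces can be sketched in userspace as follows. The helper name set_cloexec_nonblock is made up for this example; it applies both flags to an already-open descriptor, which is what a caller of plain accept() would have to do afterwards, with a window between the two steps that a single accept4() call avoids.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set close-on-exec on the file descriptor and
 * O_NONBLOCK on the open file description, using separate fcntl() calls.
 * Returns 0 on success, -1 on error.
 */
static int set_cloexec_nonblock(int fd)
{
	int fdf, flf;

	assert(fd >= 0);

	/* File descriptor flags: FD_CLOEXEC lives here. */
	fdf = fcntl(fd, F_GETFD);
	if (fdf == -1 || fcntl(fd, F_SETFD, fdf | FD_CLOEXEC) == -1)
		return -1;

	/* File status flags: O_NONBLOCK lives here. */
	flf = fcntl(fd, F_GETFL);
	if (flf == -1 || fcntl(fd, F_SETFL, flf | O_NONBLOCK) == -1)
		return -1;

	return 0;
}
```

This takes four extra syscalls per accepted connection, and another thread can fork() and exec() before FD_CLOEXEC is set.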
      
      Here's a test program.  Works on x86-32.  Should work on x86-64, but
      I (mtk) don't have a system to hand to test with.
      
      It tests accept4() with each of the four possible combinations of
      SOCK_CLOEXEC and SOCK_NONBLOCK set/clear in 'flags', and verifies
      that the appropriate flags are set on the file descriptor/open file
      description returned by accept4().
      
      I tested Ulrich's patch in this thread by applying against 2.6.28-rc2,
      and it passes according to my test program.
      
      /* test_accept4.c
      
        Copyright (C) 2008, Linux Foundation, written by Michael Kerrisk
             <mtk.manpages@gmail.com>
      
        Licensed under the GNU GPLv2 or later.
      */
      #define _GNU_SOURCE
      #include <unistd.h>
      #include <sys/syscall.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      #include <stdlib.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <string.h>
      
      #define PORT_NUM 33333
      
      #define die(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0)
      
      /**********************************************************************/
      
      /* The following is what we need until glibc gets a wrapper for
        accept4() */
      
      /* Flags for socket(), socketpair(), accept4() */
      #ifndef SOCK_CLOEXEC
      #define SOCK_CLOEXEC    O_CLOEXEC
      #endif
      #ifndef SOCK_NONBLOCK
      #define SOCK_NONBLOCK   O_NONBLOCK
      #endif
      
      #ifdef __x86_64__
      #define SYS_accept4 288
      #elif __i386__
      #define USE_SOCKETCALL 1
      #define SYS_ACCEPT4 18
      #else
      #error "Sorry -- don't know the syscall # on this architecture"
      #endif
      
      static int
      accept4(int fd, struct sockaddr *sockaddr, socklen_t *addrlen, int flags)
      {
         printf("Calling accept4(): flags = %x", flags);
         if (flags != 0) {
             printf(" (");
             if (flags & SOCK_CLOEXEC)
                 printf("SOCK_CLOEXEC");
             if ((flags & SOCK_CLOEXEC) && (flags & SOCK_NONBLOCK))
                 printf(" ");
             if (flags & SOCK_NONBLOCK)
                 printf("SOCK_NONBLOCK");
             printf(")");
         }
         printf("\n");
      
      #if USE_SOCKETCALL
         long args[6];
      
         args[0] = fd;
         args[1] = (long) sockaddr;
         args[2] = (long) addrlen;
         args[3] = flags;
      
         return syscall(SYS_socketcall, SYS_ACCEPT4, args);
      #else
         return syscall(SYS_accept4, fd, sockaddr, addrlen, flags);
      #endif
      }
      
      /**********************************************************************/
      
      static int
      do_test(int lfd, struct sockaddr_in *conn_addr,
             int closeonexec_flag, int nonblock_flag)
      {
         int connfd, acceptfd;
         int fdf, flf, fdf_pass, flf_pass;
         struct sockaddr_in claddr;
         socklen_t addrlen;
      
         printf("=======================================\n");
      
         connfd = socket(AF_INET, SOCK_STREAM, 0);
         if (connfd == -1)
             die("socket");
         if (connect(connfd, (struct sockaddr *) conn_addr,
                     sizeof(struct sockaddr_in)) == -1)
             die("connect");
      
         addrlen = sizeof(struct sockaddr_in);
         acceptfd = accept4(lfd, (struct sockaddr *) &claddr, &addrlen,
                            closeonexec_flag | nonblock_flag);
         if (acceptfd == -1) {
             perror("accept4()");
             close(connfd);
             return 0;
         }
      
         fdf = fcntl(acceptfd, F_GETFD);
         if (fdf == -1)
             die("fcntl:F_GETFD");
         fdf_pass = ((fdf & FD_CLOEXEC) != 0) ==
                    ((closeonexec_flag & SOCK_CLOEXEC) != 0);
         printf("Close-on-exec flag is %sset (%s); ",
                 (fdf & FD_CLOEXEC) ? "" : "not ",
                 fdf_pass ? "OK" : "failed");
      
         flf = fcntl(acceptfd, F_GETFL);
         if (flf == -1)
             die("fcntl:F_GETFL");
         flf_pass = ((flf & O_NONBLOCK) != 0) ==
                    ((nonblock_flag & SOCK_NONBLOCK) != 0);
         printf("nonblock flag is %sset (%s)\n",
                 (flf & O_NONBLOCK) ? "" : "not ",
                 flf_pass ? "OK" : "failed");
      
         close(acceptfd);
         close(connfd);
      
         printf("Test result: %s\n", (fdf_pass && flf_pass) ? "PASS" : "FAIL");
         return fdf_pass && flf_pass;
      }
      
      static int
      create_listening_socket(int port_num)
      {
         struct sockaddr_in svaddr;
         int lfd;
         int optval;
      
         memset(&svaddr, 0, sizeof(struct sockaddr_in));
         svaddr.sin_family = AF_INET;
         svaddr.sin_addr.s_addr = htonl(INADDR_ANY);
         svaddr.sin_port = htons(port_num);
      
         lfd = socket(AF_INET, SOCK_STREAM, 0);
         if (lfd == -1)
             die("socket");
      
         optval = 1;
         if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &optval,
                        sizeof(optval)) == -1)
             die("setsockopt");
      
         if (bind(lfd, (struct sockaddr *) &svaddr,
                  sizeof(struct sockaddr_in)) == -1)
             die("bind");
      
         if (listen(lfd, 5) == -1)
             die("listen");
      
         return lfd;
      }
      
      int
      main(int argc, char *argv[])
      {
         struct sockaddr_in conn_addr;
         int lfd;
         int port_num;
         int passed;
      
         passed = 1;
      
         port_num = (argc > 1) ? atoi(argv[1]) : PORT_NUM;
      
         memset(&conn_addr, 0, sizeof(struct sockaddr_in));
         conn_addr.sin_family = AF_INET;
         conn_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
         conn_addr.sin_port = htons(port_num);
      
         lfd = create_listening_socket(port_num);
      
         if (!do_test(lfd, &conn_addr, 0, 0))
             passed = 0;
         if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, 0))
             passed = 0;
         if (!do_test(lfd, &conn_addr, 0, SOCK_NONBLOCK))
             passed = 0;
         if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, SOCK_NONBLOCK))
             passed = 0;
      
         close(lfd);
      
         exit(passed ? EXIT_SUCCESS : EXIT_FAILURE);
      }
      
      [mtk.manpages@gmail.com: rewrote changelog, updated test program]
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Tested-by: NMichael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: NMichael Kerrisk <mtk.manpages@gmail.com>
      Cc: <linux-api@vger.kernel.org>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      de11defe
  35. 12 Nov, 2008 1 commit
  36. 25 Jul, 2008 1 commit
    • U
      flag parameters: paccept · aaca0bdc
      Ulrich Drepper committed
      This patch is by far the most complex in the series.  It adds a new syscall
      paccept.  This syscall differs from accept in that it adds (at the userlevel)
      two additional parameters:
      
      - a signal mask
      - a flags value
      
      The flags parameter can be used to set flags like SOCK_CLOEXEC.  This is
      implemented here as well.  Some people argued that this is a property which
      should be inherited from the file descriptor for the server but this is against
      POSIX.  Additionally, we really want the signal mask parameter as well
      (similar to pselect, ppoll, etc).  So an interface change is inevitable.
      
      The flag value is the same as for socket and socketpair.  I think diverging
      here will only create confusion.  Similar to the filesystem interfaces where
      the use of the O_* constants differs, it is acceptable here.
      
      The signal mask is handled as for pselect etc.  The mask is temporarily
      installed for the thread and removed before the call returns.  I modeled the
      code after pselect.  If there is a problem it's likely also in pselect.
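
The mask handling that paccept borrows from pselect can be sketched in userspace. In this hedged example (the helper name pselect_mask_demo and the choice of SIGUSR1 are assumptions, and it is meant for a single-threaded program), a signal is blocked and left pending, and the mask passed to pselect() unblocks it only for the duration of the call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_signal;

static void on_usr1(int sig)
{
	(void)sig;
	got_signal = 1;
}

/*
 * Returns 1 if pselect() was interrupted by the pending SIGUSR1 and the
 * caller's mask (SIGUSR1 blocked) was back in place afterwards, else 0.
 */
static int pselect_mask_demo(void)
{
	struct sigaction sa;
	sigset_t block, during_call, after;
	struct timespec timeout = { 1, 0 };
	int ret, saved_errno;

	sa.sa_handler = on_usr1;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) == -1)
		return 0;

	/* Block SIGUSR1; during_call is the old mask with SIGUSR1 cleared. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &block, &during_call) == -1)
		return 0;
	sigdelset(&during_call, SIGUSR1);

	got_signal = 0;
	raise(SIGUSR1);			/* blocked, so it stays pending */
	assert(got_signal == 0);

	/* The temporary mask lets the pending signal in during the call. */
	ret = pselect(0, NULL, NULL, NULL, &timeout, &during_call);
	saved_errno = errno;

	/* Query the current mask: SIGUSR1 should be blocked again. */
	if (sigprocmask(SIG_SETMASK, NULL, &after) == -1)
		return 0;

	return ret == -1 && saved_errno == EINTR && got_signal == 1 &&
	       sigismember(&after, SIGUSR1) == 1;
}
```

The handler runs only inside the pselect() call, and the original mask is in force again once it returns, which matches the "temporarily installed for the thread and removed before the call returns" behaviour described for paccept.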
      
      For architectures which use socketcall I maintained this interface instead of
      adding a system call.  The symmetry shouldn't be broken.
      
      The following test must be adjusted for architectures other than x86 and
      x86-64 and in case the syscall numbers changed.
      
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      #include <errno.h>
      #include <fcntl.h>
      #include <pthread.h>
      #include <signal.h>
      #include <stdio.h>
      #include <unistd.h>
      #include <netinet/in.h>
      #include <sys/socket.h>
      #include <sys/syscall.h>
      
      #ifndef __NR_paccept
      # ifdef __x86_64__
      #  define __NR_paccept 288
      # elif defined __i386__
      #  define SYS_PACCEPT 18
      #  define USE_SOCKETCALL 1
      # else
      #  error "need __NR_paccept"
      # endif
      #endif
      
      #ifdef USE_SOCKETCALL
      # define paccept(fd, addr, addrlen, mask, flags) \
        ({ long args[6] = { \
             (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \
           syscall (__NR_socketcall, SYS_PACCEPT, args); })
      #else
      # define paccept(fd, addr, addrlen, mask, flags) \
        syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags)
      #endif
      
      #define PORT 57392
      
      #define SOCK_CLOEXEC O_CLOEXEC
      
      static pthread_barrier_t b;
      
      static void *
      tf (void *arg)
      {
        pthread_barrier_wait (&b);
        int s = socket (AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in sin;
        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
        sin.sin_port = htons (PORT);
        connect (s, (const struct sockaddr *) &sin, sizeof (sin));
        close (s);
      
        pthread_barrier_wait (&b);
        s = socket (AF_INET, SOCK_STREAM, 0);
        sin.sin_port = htons (PORT);
        connect (s, (const struct sockaddr *) &sin, sizeof (sin));
        close (s);
        pthread_barrier_wait (&b);
      
        pthread_barrier_wait (&b);
        sleep (2);
        pthread_kill ((pthread_t) arg, SIGUSR1);
      
        return NULL;
      }
      
      static void
      handler (int s)
      {
      }
      
      int
      main (void)
      {
        pthread_barrier_init (&b, NULL, 2);
      
        struct sockaddr_in sin;
        pthread_t th;
        if (pthread_create (&th, NULL, tf, (void *) pthread_self ()) != 0)
          {
            puts ("pthread_create failed");
            return 1;
          }
      
        int s = socket (AF_INET, SOCK_STREAM, 0);
        int reuse = 1;
        setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
        sin.sin_port = htons (PORT);
        bind (s, (struct sockaddr *) &sin, sizeof (sin));
        listen (s, SOMAXCONN);
      
        pthread_barrier_wait (&b);
      
        int s2 = paccept (s, NULL, 0, NULL, 0);
        if (s2 < 0)
          {
            puts ("paccept(0) failed");
            return 1;
          }
      
        int coe = fcntl (s2, F_GETFD);
        if (coe & FD_CLOEXEC)
          {
            puts ("paccept(0) set close-on-exec-flag");
            return 1;
          }
        close (s2);
      
        pthread_barrier_wait (&b);
      
        s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC);
        if (s2 < 0)
          {
            puts ("paccept(SOCK_CLOEXEC) failed");
            return 1;
          }
      
        coe = fcntl (s2, F_GETFD);
        if ((coe & FD_CLOEXEC) == 0)
          {
            puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag");
            return 1;
          }
        close (s2);
      
        pthread_barrier_wait (&b);
      
        struct sigaction sa;
        sa.sa_handler = handler;
        sa.sa_flags = 0;
        sigemptyset (&sa.sa_mask);
        sigaction (SIGUSR1, &sa, NULL);
      
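        /* Block SIGUSR1 in this thread, then pass paccept a mask without it so
           the signal is unblocked only while the call is waiting.  */
        sigset_t ss;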
        pthread_sigmask (SIG_SETMASK, NULL, &ss);
        sigaddset (&ss, SIGUSR1);
        pthread_sigmask (SIG_SETMASK, &ss, NULL);
      
        sigdelset (&ss, SIGUSR1);
        alarm (4);
        pthread_barrier_wait (&b);
      
        errno = 0;
        s2 = paccept (s, NULL, 0, &ss, 0);
        if (s2 != -1 || errno != EINTR)
          {
            puts ("paccept did not fail with EINTR");
            return 1;
          }
      
        close (s);
      
        puts ("OK");
      
        return 0;
      }
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      [akpm@linux-foundation.org: make it compile]
      [akpm@linux-foundation.org: add sys_ni stub]
      Signed-off-by: Ulrich Drepper <drepper@redhat.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: <linux-arch@vger.kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aaca0bdc