1. 25 Aug, 2011 1 commit
    • sendmmsg/sendmsg: fix unsafe user pointer access · bc909d9d
      Mathieu Desnoyers committed
      Dereferencing a user pointer directly from kernel space without going
      through the copy_from_user family of functions is a bad idea. Two
      such usages can be found in the sendmsg code path called from sendmmsg,
      added by
      
      commit c71d8ebe upstream.
      commit 5b47b8038f183b44d2d8ff1c7d11a5c1be706b34 in the 3.0-stable tree.
      
      Usages are performed through memcmp() and memcpy() directly. Fix those
      by using the already copied msg_sys structure instead of the __user *msg
      structure. Note that msg_sys can be set to NULL by verify_compat_iovec()
      or verify_iovec(), which requires additional NULL pointer checks.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: David Goulet <dgoulet@ev0ke.net>
      CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      CC: Anton Blanchard <anton@samba.org>
      CC: David S. Miller <davem@davemloft.net>
      CC: stable <stable@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 05 Aug, 2011 3 commits
    • net: Fix security_socket_sendmsg() bypass problem. · c71d8ebe
      Tetsuo Handa committed
      The sendmmsg() introduced by commit 228e548e "net: Add sendmmsg socket system
      call" is capable of sending to multiple different destination addresses.
      
      SMACK uses the destination address to check sendmsg() permission.
      However, security_socket_sendmsg() is called only once even if multiple
      different destination addresses are passed to sendmmsg().
      
      Therefore, we need to call security_socket_sendmsg() for each destination
      address rather than only the first destination address.
      
      Since calling security_socket_sendmsg() for every datagram is a waste of
      time when only a single destination address was passed to sendmmsg(), omit
      the call unless the destination address of the current datagram differs
      from that of the previous one.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: Anton Blanchard <anton@samba.org>
      Cc: stable <stable@kernel.org> [3.0+]
      Signed-off-by: David S. Miller <davem@davemloft.net>
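
The "check only when the destination changes" idea from this commit can be sketched in plain C. This is a user-space illustration under stated assumptions, not the kernel code: `struct dest`, `check_send_permission()` and `check_batch()` are hypothetical names, and the hook here always allows the send.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a datagram's destination address. */
struct dest {
    unsigned char addr[16];
    size_t len;
};

/* Counts how often the (hypothetical) permission hook is invoked. */
int checks_done;

/* Hypothetical permission hook; 0 means "allowed" in this sketch. */
int check_send_permission(const struct dest *d)
{
    (void)d;
    checks_done++;
    return 0;
}

/*
 * Walk a batch and run the permission check only when the destination
 * differs from the previous datagram's. Returns how many datagrams
 * passed, stopping at the first denial.
 */
size_t check_batch(const struct dest *batch, size_t n)
{
    const struct dest *prev = NULL;
    size_t i;

    assert(batch != NULL || n == 0);
    for (i = 0; i < n; i++) {
        const struct dest *cur = &batch[i];
        int same = prev && prev->len == cur->len &&
                   memcmp(prev->addr, cur->addr, cur->len) == 0;

        if (!same && check_send_permission(cur) != 0)
            break;
        prev = cur;
    }
    return i;
}
```

A batch of four datagrams addressed to A, A, B, B would trigger the hook twice in this sketch: once for the first datagram and once when the address changes.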
    • net: Cap number of elements for sendmmsg · 98382f41
      Anton Blanchard committed
      To limit the amount of time we can spend in sendmmsg, cap the
      number of elements to UIO_MAXIOV (currently 1024).
      
      For error handling an application using sendmmsg needs to retry at
      the first unsent message, so capping is simpler and requires less
      application logic than returning EINVAL.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable <stable@kernel.org> [3.0+]
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sendmmsg should only return an error if no messages were sent · 728ffb86
      Anton Blanchard committed
      sendmmsg uses an error return strategy similar to that of recvmmsg, but it
      turns out to be a confusing way to communicate errors.
      
      The current code stores the error code away and returns it on the next
      sendmmsg call. This means a call with completely valid arguments could
      get an error from a previous call.
      
      Change things so we only return an error if no datagrams could be sent.
      If fewer than the requested number of messages were sent, the application
      must retry starting at the first failed one, and if the problem is
      persistent the error will be returned.
      
      This matches the behaviour of other syscalls like read/write - it
      is not an error if fewer than the requested number of elements are sent.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable <stable@kernel.org> [3.0+]
      Signed-off-by: David S. Miller <davem@davemloft.net>
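
The retry behaviour these two commits expect from applications (resume at the first unsent message after a short count) might look roughly like the loop below. It is a sketch with a mock sender rather than a real socket: `batch_send_fn`, `send_all()`, `capped_sender()` and `MAX_BATCH` are hypothetical names, and the mock only imitates the return convention described in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BATCH 1024 /* assumed per-call cap, mirroring UIO_MAXIOV */

/*
 * Hypothetical batch sender with sendmmsg-like return semantics:
 * returns how many messages were sent (possibly fewer than n), or -1
 * only if none could be sent.
 */
typedef int (*batch_send_fn)(size_t first, size_t n, void *ctx);

/*
 * Send `total` messages, resuming at the first unsent one after every
 * short count. Returns the number actually sent; a persistent error
 * shows up as a result smaller than `total`.
 */
size_t send_all(batch_send_fn send, size_t total, void *ctx)
{
    size_t done = 0;

    assert(send != NULL);
    while (done < total) {
        int sent = send(done, total - done, ctx);

        if (sent <= 0)
            break; /* nothing was sent this round: give up */
        done += (size_t)sent;
    }
    return done;
}

/* Mock sender for illustration: accepts at most MAX_BATCH per call. */
int capped_sender(size_t first, size_t n, void *ctx)
{
    size_t *calls = ctx;

    (void)first;
    (*calls)++;
    return (int)(n > MAX_BATCH ? MAX_BATCH : n);
}
```

With the mock, asking for 2500 messages takes three calls (1024, 1024, 452). In a real program the callback would wrap a `sendmmsg()` call on the remaining slice of the message vector.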
  3. 28 Jul, 2011 1 commit
  4. 27 Jul, 2011 1 commit
  5. 18 May, 2011 1 commit
  6. 08 May, 2011 1 commit
  7. 06 May, 2011 1 commit
    • net: Add sendmmsg socket system call · 228e548e
      Anton Blanchard committed
      This patch adds a multiple message send syscall and is the send
      version of the existing recvmmsg syscall. This is heavily
      based on the patch by Arnaldo that added recvmmsg.
      
      I wrote a microbenchmark to test the performance gains of using
      this new syscall:
      
      http://ozlabs.org/~anton/junkcode/sendmmsg_test.c
      
      The test was run on a ppc64 box with a 10 Gbit network card. The
      benchmark can send both UDP and RAW ethernet packets.
      
      64B UDP
      
      batch   pkts/sec
      1       804570
      2       872800 (+ 8 %)
      4       916556 (+14 %)
      8       939712 (+17 %)
      16      952688 (+18 %)
      32      956448 (+19 %)
      64      964800 (+20 %)
      
      64B raw socket
      
      batch   pkts/sec
      1       1201449
      2       1350028 (+12 %)
      4       1461416 (+22 %)
      8       1513080 (+26 %)
      16      1541216 (+28 %)
      32      1553440 (+29 %)
      64      1557888 (+30 %)
      
      We see a 20% improvement in throughput on UDP send and 30%
      on raw socket send.
      
      [ Add sparc syscall entries. -DaveM ]
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
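
The percentage column in the tables above appears to be the gain over the batch-size-1 rate, rounded to the nearest whole percent. A small helper (the name `gain_percent()` is hypothetical) shows the arithmetic:

```c
#include <assert.h>

/* Percentage gain of `rate` over `base`, rounded to the nearest integer. */
unsigned long long gain_percent(unsigned long long base,
                                unsigned long long rate)
{
    assert(base > 0 && rate >= base);
    return ((rate - base) * 100 + base / 2) / base;
}
```

For example, the UDP row for batch 64 is (964800 - 804570) / 804570, about 19.9 %, which rounds to the 20 % quoted in the summary, and the raw socket row for batch 64 works out to about 29.7 %, rounded to 30 %.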
  8. 12 Apr, 2011 1 commit
  9. 31 Mar, 2011 1 commit
  10. 19 Mar, 2011 1 commit
  11. 24 Feb, 2011 1 commit
  12. 23 Feb, 2011 1 commit
  13. 01 Feb, 2011 2 commits
    • Revert "appletalk: move to staging" · 0ffbf8bf
      Greg Kroah-Hartman committed
      This reverts commit a6238f21
      
      Appletalk got some patches to fix up the BLK usage in it in the
      network tree, so this removal isn't needed.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: <acme@ghostprotocols.net>
      Cc: netdev@vger.kernel.org,
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • appletalk: move to staging · a6238f21
      Arnd Bergmann committed
      For all I know, Appletalk is dead, the only reasonable
      use right now would be nostalgia, and that can be served
      well enough by old kernels. The code is largely not
      in a bad shape, but it still uses the big kernel lock,
      and nobody seems motivated to change that.
      
      FWIW, the last release of MacOS that supported Appletalk
      was MacOS X 10.5, made in 2007, and it has been abandoned
      by Apple with 10.6. Using TCP/IP instead of Appletalk has
      been supported since MacOS 7.6, which was released in
      1997 and is able to run on most of the legacy hardware.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  14. 13 Jan, 2011 1 commit
  15. 07 Jan, 2011 5 commits
    • fs: scale mntget/mntput · b3e19d92
      Nick Piggin committed
      The problem that this patch aims to fix is vfsmount refcounting scalability.
      We need to take a reference on the vfsmount for every successful path lookup,
      and these lookups often go to the same mount point.
      
      The fundamental difficulty is that a "simple" reference count can never be made
      scalable, because any time a reference is dropped, we must check whether that
      was the last reference. To do that requires communication with all other CPUs
      that may have taken a reference count.
      
      We can make refcounts more scalable in a couple of ways, involving keeping
      distributed counters, and checking for the global-zero condition less
      frequently.
      
      - check the global sum once every interval (this will delay zero detection
        for some interval, so it's probably a showstopper for vfsmounts).
      
      - keep a local count and only take the global sum when local reaches 0 (this
        is difficult for vfsmounts, because we can't hold preempt off for the life of
        a reference, so a counter would need to be per-thread or tied strongly to a
        particular CPU which requires more locking).
      
      - keep a local difference of increments and decrements, which allows us to sum
        the total difference and hence find the refcount when summing all CPUs. Then,
        keep a single integer "long" refcount for slow and long lasting references,
        and only take the global sum of local counters when the long refcount is 0.
      
      This last scheme is what I implemented here. Attached mounts and process root
      and working directory references are "long" references, and everything else is
      a short reference.
      
      This allows scalable vfsmount references during path walking over mounted
      subtrees and unattached (lazy umounted) mounts with processes still running
      in them.
      
      This results in one fewer atomic op in the fastpath: mntget is now just a
      per-CPU inc, rather than an atomic inc; and mntput just requires a spinlock
      and non-atomic decrement in the common case. However code is otherwise bigger
      and heavier, so single threaded performance is basically a wash.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
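
The third scheme in the commit message (per-CPU differences plus a single "long" refcount) can be illustrated with a single-threaded sketch. It is not the kernel implementation: `struct mount_ref`, `ref_get()`, `ref_put()` and `NR_CPUS` are hypothetical here, and all locking and preemption handling is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4 /* assumed CPU count for the sketch */

struct mount_ref {
    long local[NR_CPUS]; /* per-CPU increments minus decrements */
    long longrefs;       /* slow, long-lasting references */
};

/* Fast path: a short reference is counted on the current CPU only. */
void ref_get(struct mount_ref *m, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    m->local[cpu]++;
}

/*
 * Drop a short reference. Returns true when the object is unused.
 * The global sum is taken only when no long reference pins the object.
 */
bool ref_put(struct mount_ref *m, int cpu)
{
    long sum = 0;
    int i;

    assert(cpu >= 0 && cpu < NR_CPUS);
    m->local[cpu]--;
    if (m->longrefs > 0)
        return false; /* still pinned: no need to sum */
    for (i = 0; i < NR_CPUS; i++)
        sum += m->local[i];
    return sum == 0;
}
```

A reference taken on one CPU and dropped on another leaves +1 and -1 in two slots, which is why only the sum across all CPUs is meaningful.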
    • fs: improve scalability of pseudo filesystems · 4b936885
      Nick Piggin committed
      Regardless of how much we try to scale dcache, there is likely
      always going to be some fundamental contention when adding or removing children
      under the same parent. Pseudo filesystems do not seem to need connected
      dentries because by definition they are disconnected.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
    • fs: dcache reduce branches in lookup path · fb045adb
      Nick Piggin committed
      Reduce some branches and memory accesses in dcache lookup by adding dentry
      flags to indicate common d_ops are set, rather than having to check them.
      This saves a pointer memory access (dentry->d_op) in common path lookup
      situations, and saves another pointer load and branch in cases where we
      have d_op but not the particular operation.
      
      Patched with:
      
      git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
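
The flag-caching idea behind d_set_d_op() can be shown with a simplified stand-in. Everything below is hypothetical (`struct entry`, `entry_set_ops()`, the `OP_*` bits); it only illustrates how recording "which callbacks exist" at assignment time lets the lookup path test one flag instead of loading two pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits recording which operations are present. */
#define OP_HASH    0x1u
#define OP_COMPARE 0x2u

struct entry_ops {
    int (*hash)(const char *name);
    int (*compare)(const char *a, const char *b);
};

struct entry {
    unsigned int flags;
    const struct entry_ops *ops;
};

/* Install the ops once and cache which callbacks exist in the flags. */
void entry_set_ops(struct entry *e, const struct entry_ops *ops)
{
    assert(e != NULL);
    e->ops = ops;
    e->flags &= ~(OP_HASH | OP_COMPARE);
    if (!ops)
        return;
    if (ops->hash)
        e->flags |= OP_HASH;
    if (ops->compare)
        e->flags |= OP_COMPARE;
}

/* Lookup path: one flag test replaces loading e->ops and e->ops->hash. */
int entry_hash(const struct entry *e, const char *name)
{
    if (e->flags & OP_HASH)
        return e->ops->hash(name);
    return 0; /* default when no custom hash is installed */
}

/* Example callback used for illustration. */
int first_char_hash(const char *name)
{
    return (unsigned char)name[0];
}
```

Routing every assignment through one setter is what makes the cached flags trustworthy, which is presumably why the commit rewrites all direct `d_op` assignments with the sed command shown above.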
    • fs: avoid inode RCU freeing for pseudo fs · ff0c7d15
      Nick Piggin committed
      Pseudo filesystems whose inodes are neither put on an RCU list nor reachable
      by rcu-walk dentries do not need to RCU free their inodes.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
    • fs: icache RCU free inodes · fa0d7e3d
      Nick Piggin committed
      RCU free the struct inode. This will allow:
      
      - Subsequent store-free path walking patch. The inode must be consulted for
        permissions when walking, so an RCU inode reference is a must.
      - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
        to take i_lock no longer need to take sb_inode_list_lock to walk the list in
        the first place. This will simplify and optimize locking.
      - Could remove some nested trylock loops in dcache code
      - Could potentially simplify things a bit in VM land. Do not need to take the
        page lock to follow page->mapping.
      
      The downside of this is the performance cost of using RCU. In a simple
      creat/unlink microbenchmark, performance drops by about 10% due to inability to
      reuse cache-hot slab objects. As iterations increase and RCU freeing starts
      kicking over, this increases to about 20%.
      
      In cases where inode lifetimes are longer (ie. many inodes may be allocated
      during the average life span of a single inode), a lot of this cache reuse is
      not applicable, so the regression caused by this patch is smaller.
      
      The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
      however this adds some complexity to list walking and store-free path walking,
      so I prefer to implement this at a later date, if it is shown to be a win in
      real situations. I haven't found a regression in any non-micro benchmark so I
      doubt it will be a problem.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
  16. 11 Dec, 2010 1 commit
  17. 13 Nov, 2010 1 commit
  18. 31 Oct, 2010 1 commit
  19. 29 Oct, 2010 1 commit
  20. 26 Oct, 2010 2 commits
  21. 21 Oct, 2010 1 commit
  22. 15 Oct, 2010 1 commit
    • llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann committed
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // neither read nor write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
  23. 02 Oct, 2010 1 commit
  24. 09 Sep, 2010 1 commit
    • net: remove address space warnings in net/socket.c · fb8621bb
      Namhyung Kim committed
      Casts from __kernel to __user pointers require __force markup, so add it.
      Also, sock_getsockopt() and sock_setsockopt() take their @optval and/or
      @optlen arguments as user pointers but were being passed kernel pointers;
      use new variables 'uoptval' and/or 'uoptlen' to fix it. These changes
      remove the following warnings from sparse:
      
       net/socket.c:1922:46: warning: cast adds address space to expression (<asn:1>)
       net/socket.c:3061:61: warning: incorrect type in argument 4 (different address spaces)
       net/socket.c:3061:61:    expected char [noderef] <asn:1>*optval
       net/socket.c:3061:61:    got char *optval
       net/socket.c:3061:69: warning: incorrect type in argument 5 (different address spaces)
       net/socket.c:3061:69:    expected int [noderef] <asn:1>*optlen
       net/socket.c:3061:69:    got int *optlen
       net/socket.c:3063:67: warning: incorrect type in argument 4 (different address spaces)
       net/socket.c:3063:67:    expected char [noderef] <asn:1>*optval
       net/socket.c:3063:67:    got char *optval
       net/socket.c:3064:45: warning: incorrect type in argument 5 (different address spaces)
       net/socket.c:3064:45:    expected int [noderef] <asn:1>*optlen
       net/socket.c:3064:45:    got int *optlen
       net/socket.c:3078:61: warning: incorrect type in argument 4 (different address spaces)
       net/socket.c:3078:61:    expected char [noderef] <asn:1>*optval
       net/socket.c:3078:61:    got char *optval
       net/socket.c:3080:67: warning: incorrect type in argument 4 (different address spaces)
       net/socket.c:3080:67:    expected char [noderef] <asn:1>*optval
       net/socket.c:3080:67:    got char *optval
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
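
The kind of annotated cast this commit adds can be sketched outside the kernel. The macros below are stand-ins with hypothetical names (`USER_PTR` for `__user`, `FORCE_CAST` for `__force`), modelled on the assumption that such annotations only carry meaning when the sparse checker defines `__CHECKER__` and expand to nothing for a regular compiler. The functions are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's __user and __force annotations. */
#ifdef __CHECKER__
# define USER_PTR   __attribute__((noderef, address_space(1)))
# define FORCE_CAST __attribute__((force))
#else
# define USER_PTR
# define FORCE_CAST
#endif

/* Hypothetical callee that expects a user-space pointer. */
size_t opt_len(const char USER_PTR *optval, size_t len)
{
    (void)optval; /* a real callee would copy the data in, not dereference */
    return len;
}

/*
 * Caller holding a kernel buffer: the address-space change is made
 * explicit through a separate variable and a forced cast, so the
 * checker sees an intentional conversion rather than a mistake.
 */
size_t call_with_kernel_buffer(char *kbuf, size_t len)
{
    const char USER_PTR *uoptval = (FORCE_CAST const char USER_PTR *)kbuf;

    assert(kbuf != NULL);
    return opt_len(uoptval, len);
}
```

Compiled normally, the annotations vanish and the cast is a no-op; the benefit only appears when the code is run through the checker.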
  25. 19 Aug, 2010 1 commit
  26. 19 Jul, 2010 2 commits
  27. 04 Jun, 2010 1 commit
    • From abbffa2aa9bd6f8df16d0d0a102af677510d8b9a Mon Sep 17 00:00:00 2001 · c6d409cf
      Eric Dumazet committed
      From: Eric Dumazet <eric.dumazet@gmail.com>
      Date: Thu, 3 Jun 2010 04:29:41 +0000
      Subject: [PATCH 2/3] net: net/socket.c and net/compat.c cleanups
      
      cleanup patch, to match modern coding style.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ---
       net/compat.c |   47 ++++++++---------
       net/socket.c |  165 ++++++++++++++++++++++++++++------------------------------
       2 files changed, 102 insertions(+), 110 deletions(-)
      
      diff --git a/net/compat.c b/net/compat.c
      index 1cf7590..63d260e 100644
      --- a/net/compat.c
      +++ b/net/compat.c
      @@ -81,7 +81,7 @@ int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *kern_iov,
       	int tot_len;
      
       	if (kern_msg->msg_namelen) {
      -		if (mode==VERIFY_READ) {
      +		if (mode == VERIFY_READ) {
       			int err = move_addr_to_kernel(kern_msg->msg_name,
       						      kern_msg->msg_namelen,
       						      kern_address);
      @@ -354,7 +354,7 @@ static int do_set_attach_filter(struct socket *sock, int level, int optname,
       static int do_set_sock_timeout(struct socket *sock, int level,
       		int optname, char __user *optval, unsigned int optlen)
       {
      -	struct compat_timeval __user *up = (struct compat_timeval __user *) optval;
      +	struct compat_timeval __user *up = (struct compat_timeval __user *)optval;
       	struct timeval ktime;
       	mm_segment_t old_fs;
       	int err;
      @@ -367,7 +367,7 @@ static int do_set_sock_timeout(struct socket *sock, int level,
       		return -EFAULT;
       	old_fs = get_fs();
       	set_fs(KERNEL_DS);
      -	err = sock_setsockopt(sock, level, optname, (char *) &ktime, sizeof(ktime));
      +	err = sock_setsockopt(sock, level, optname, (char *)&ktime, sizeof(ktime));
       	set_fs(old_fs);
      
       	return err;
      @@ -389,11 +389,10 @@ asmlinkage long compat_sys_setsockopt(int fd, int level, int optname,
       				char __user *optval, unsigned int optlen)
       {
       	int err;
      -	struct socket *sock;
      +	struct socket *sock = sockfd_lookup(fd, &err);
      
      -	if ((sock = sockfd_lookup(fd, &err))!=NULL)
      -	{
      -		err = security_socket_setsockopt(sock,level,optname);
      +	if (sock) {
      +		err = security_socket_setsockopt(sock, level, optname);
       		if (err) {
       			sockfd_put(sock);
       			return err;
      @@ -453,7 +452,7 @@ static int compat_sock_getsockopt(struct socket *sock, int level, int optname,
       int compat_sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp)
       {
       	struct compat_timeval __user *ctv =
      -			(struct compat_timeval __user*) userstamp;
      +			(struct compat_timeval __user *) userstamp;
       	int err = -ENOENT;
       	struct timeval tv;
      
      @@ -477,7 +476,7 @@ EXPORT_SYMBOL(compat_sock_get_timestamp);
       int compat_sock_get_timestampns(struct sock *sk, struct timespec __user *userstamp)
       {
       	struct compat_timespec __user *ctv =
      -			(struct compat_timespec __user*) userstamp;
      +			(struct compat_timespec __user *) userstamp;
       	int err = -ENOENT;
       	struct timespec ts;
      
      @@ -502,12 +501,10 @@ asmlinkage long compat_sys_getsockopt(int fd, int level, int optname,
       				char __user *optval, int __user *optlen)
       {
       	int err;
      -	struct socket *sock;
      +	struct socket *sock = sockfd_lookup(fd, &err);
      
      -	if ((sock = sockfd_lookup(fd, &err))!=NULL)
      -	{
      -		err = security_socket_getsockopt(sock, level,
      -							   optname);
      +	if (sock) {
      +		err = security_socket_getsockopt(sock, level, optname);
       		if (err) {
       			sockfd_put(sock);
       			return err;
      @@ -557,7 +554,7 @@ struct compat_group_filter {
      
       int compat_mc_setsockopt(struct sock *sock, int level, int optname,
       	char __user *optval, unsigned int optlen,
      -	int (*setsockopt)(struct sock *,int,int,char __user *,unsigned int))
      +	int (*setsockopt)(struct sock *, int, int, char __user *, unsigned int))
       {
       	char __user	*koptval = optval;
       	int		koptlen = optlen;
      @@ -640,12 +637,11 @@ int compat_mc_setsockopt(struct sock *sock, int level, int optname,
       	}
       	return setsockopt(sock, level, optname, koptval, koptlen);
       }
      -
       EXPORT_SYMBOL(compat_mc_setsockopt);
      
       int compat_mc_getsockopt(struct sock *sock, int level, int optname,
       	char __user *optval, int __user *optlen,
      -	int (*getsockopt)(struct sock *,int,int,char __user *,int __user *))
      +	int (*getsockopt)(struct sock *, int, int, char __user *, int __user *))
       {
       	struct compat_group_filter __user *gf32 = (void *)optval;
       	struct group_filter __user *kgf;
      @@ -681,7 +677,7 @@ int compat_mc_getsockopt(struct sock *sock, int level, int optname,
       	    __put_user(interface, &kgf->gf_interface) ||
       	    __put_user(fmode, &kgf->gf_fmode) ||
       	    __put_user(numsrc, &kgf->gf_numsrc) ||
      -	    copy_in_user(&kgf->gf_group,&gf32->gf_group,sizeof(kgf->gf_group)))
      +	    copy_in_user(&kgf->gf_group, &gf32->gf_group, sizeof(kgf->gf_group)))
       		return -EFAULT;
      
       	err = getsockopt(sock, level, optname, (char __user *)kgf, koptlen);
      @@ -714,21 +710,22 @@ int compat_mc_getsockopt(struct sock *sock, int level, int optname,
       		copylen = numsrc * sizeof(gf32->gf_slist[0]);
       		if (copylen > klen)
       			copylen = klen;
      -	        if (copy_in_user(gf32->gf_slist, kgf->gf_slist, copylen))
      +		if (copy_in_user(gf32->gf_slist, kgf->gf_slist, copylen))
       			return -EFAULT;
       	}
       	return err;
       }
      -
       EXPORT_SYMBOL(compat_mc_getsockopt);
      
       /* Argument list sizes for compat_sys_socketcall */
       #define AL(x) ((x) * sizeof(u32))
      -static unsigned char nas[20]={AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
      -				AL(3),AL(3),AL(4),AL(4),AL(4),AL(6),
      -				AL(6),AL(2),AL(5),AL(5),AL(3),AL(3),
      -				AL(4),AL(5)};
      +static unsigned char nas[20] = {
      +	AL(0), AL(3), AL(3), AL(3), AL(2), AL(3),
      +	AL(3), AL(3), AL(4), AL(4), AL(4), AL(6),
      +	AL(6), AL(2), AL(5), AL(5), AL(3), AL(3),
      +	AL(4), AL(5)
      +};
       #undef AL
      
       asmlinkage long compat_sys_sendmsg(int fd, struct compat_msghdr __user *msg, unsigned flags)
      @@ -827,7 +824,7 @@ asmlinkage long compat_sys_socketcall(int call, u32 __user *args)
       					  compat_ptr(a[4]), compat_ptr(a[5]));
       		break;
       	case SYS_SHUTDOWN:
      -		ret = sys_shutdown(a0,a1);
      +		ret = sys_shutdown(a0, a1);
       		break;
       	case SYS_SETSOCKOPT:
       		ret = compat_sys_setsockopt(a0, a1, a[2],
      diff --git a/net/socket.c b/net/socket.c
      index 367d547..b63c051 100644
      --- a/net/socket.c
      +++ b/net/socket.c
      @@ -124,7 +124,7 @@ static int sock_fasync(int fd, struct file *filp, int on);
       static ssize_t sock_sendpage(struct file *file, struct page *page,
       			     int offset, size_t size, loff_t *ppos, int more);
       static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
      -			        struct pipe_inode_info *pipe, size_t len,
      +				struct pipe_inode_info *pipe, size_t len,
       				unsigned int flags);
      
       /*
      @@ -162,7 +162,7 @@ static const struct net_proto_family *net_families[NPROTO] __read_mostly;
        *	Statistics counters of the socket lists
        */
      
      -static DEFINE_PER_CPU(int, sockets_in_use) = 0;
      +static DEFINE_PER_CPU(int, sockets_in_use);
      
       /*
        * Support routines.
      @@ -309,9 +309,9 @@ static int init_inodecache(void)
       }
      
       static const struct super_operations sockfs_ops = {
      -	.alloc_inode =	sock_alloc_inode,
      -	.destroy_inode =sock_destroy_inode,
      -	.statfs =	simple_statfs,
      +	.alloc_inode	= sock_alloc_inode,
      +	.destroy_inode	= sock_destroy_inode,
      +	.statfs		= simple_statfs,
       };
      
       static int sockfs_get_sb(struct file_system_type *fs_type,
      @@ -411,6 +411,7 @@ int sock_map_fd(struct socket *sock, int flags)
      
       	return fd;
       }
      +EXPORT_SYMBOL(sock_map_fd);
      
       static struct socket *sock_from_file(struct file *file, int *err)
       {
      @@ -422,7 +423,7 @@ static struct socket *sock_from_file(struct file *file, int *err)
       }
      
       /**
      - *	sockfd_lookup	- 	Go from a file number to its socket slot
      + *	sockfd_lookup - Go from a file number to its socket slot
        *	@fd: file handle
        *	@err: pointer to an error code return
        *
      @@ -450,6 +451,7 @@ struct socket *sockfd_lookup(int fd, int *err)
       		fput(file);
       	return sock;
       }
      +EXPORT_SYMBOL(sockfd_lookup);
      
       static struct socket *sockfd_lookup_light(int fd, int *err, int *fput_needed)
       {
      @@ -540,6 +542,7 @@ void sock_release(struct socket *sock)
       	}
       	sock->file = NULL;
       }
      +EXPORT_SYMBOL(sock_release);
      
       int sock_tx_timestamp(struct msghdr *msg, struct sock *sk,
       		      union skb_shared_tx *shtx)
      @@ -586,6 +589,7 @@ int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
       		ret = wait_on_sync_kiocb(&iocb);
       	return ret;
       }
      +EXPORT_SYMBOL(sock_sendmsg);
      
       int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
       		   struct kvec *vec, size_t num, size_t size)
      @@ -604,6 +608,7 @@ int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
       	set_fs(oldfs);
       	return result;
       }
      +EXPORT_SYMBOL(kernel_sendmsg);
      
       static int ktime2ts(ktime_t kt, struct timespec *ts)
       {
      @@ -664,7 +669,6 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
       		put_cmsg(msg, SOL_SOCKET,
       			 SCM_TIMESTAMPING, sizeof(ts), &ts);
       }
      -
       EXPORT_SYMBOL_GPL(__sock_recv_timestamp);
      
       inline void sock_recv_drops(struct msghdr *msg, struct sock *sk, struct sk_buff *skb)
      @@ -720,6 +724,7 @@ int sock_recvmsg(struct socket *sock, struct msghdr *msg,
       		ret = wait_on_sync_kiocb(&iocb);
       	return ret;
       }
      +EXPORT_SYMBOL(sock_recvmsg);
      
       static int sock_recvmsg_nosec(struct socket *sock, struct msghdr *msg,
       			      size_t size, int flags)
      @@ -752,6 +757,7 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
       	set_fs(oldfs);
       	return result;
       }
      +EXPORT_SYMBOL(kernel_recvmsg);
      
       static void sock_aio_dtor(struct kiocb *iocb)
       {
      @@ -774,7 +780,7 @@ static ssize_t sock_sendpage(struct file *file, struct page *page,
       }
      
       static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
      -			        struct pipe_inode_info *pipe, size_t len,
      +				struct pipe_inode_info *pipe, size_t len,
       				unsigned int flags)
       {
       	struct socket *sock = file->private_data;
      @@ -887,7 +893,7 @@ static ssize_t sock_aio_write(struct kiocb *iocb, const struct iovec *iov,
        */
      
       static DEFINE_MUTEX(br_ioctl_mutex);
      -static int (*br_ioctl_hook) (struct net *, unsigned int cmd, void __user *arg) = NULL;
      +static int (*br_ioctl_hook) (struct net *, unsigned int cmd, void __user *arg);
      
       void brioctl_set(int (*hook) (struct net *, unsigned int, void __user *))
       {
      @@ -895,7 +901,6 @@ void brioctl_set(int (*hook) (struct net *, unsigned int, void __user *))
       	br_ioctl_hook = hook;
       	mutex_unlock(&br_ioctl_mutex);
       }
      -
       EXPORT_SYMBOL(brioctl_set);
      
       static DEFINE_MUTEX(vlan_ioctl_mutex);
      @@ -907,7 +912,6 @@ void vlan_ioctl_set(int (*hook) (struct net *, void __user *))
       	vlan_ioctl_hook = hook;
       	mutex_unlock(&vlan_ioctl_mutex);
       }
      -
       EXPORT_SYMBOL(vlan_ioctl_set);
      
       static DEFINE_MUTEX(dlci_ioctl_mutex);
      @@ -919,7 +923,6 @@ void dlci_ioctl_set(int (*hook) (unsigned int, void __user *))
       	dlci_ioctl_hook = hook;
       	mutex_unlock(&dlci_ioctl_mutex);
       }
      -
       EXPORT_SYMBOL(dlci_ioctl_set);
      
       static long sock_do_ioctl(struct net *net, struct socket *sock,
      @@ -1047,6 +1050,7 @@ out_release:
       	sock = NULL;
       	goto out;
       }
      +EXPORT_SYMBOL(sock_create_lite);
      
       /* No kernel lock held - perfect */
       static unsigned int sock_poll(struct file *file, poll_table *wait)
      @@ -1147,6 +1151,7 @@ call_kill:
       	rcu_read_unlock();
       	return 0;
       }
      +EXPORT_SYMBOL(sock_wake_async);
      
       static int __sock_create(struct net *net, int family, int type, int protocol,
       			 struct socket **res, int kern)
      @@ -1265,11 +1270,13 @@ int sock_create(int family, int type, int protocol, struct socket **res)
       {
       	return __sock_create(current->nsproxy->net_ns, family, type, protocol, res, 0);
       }
      +EXPORT_SYMBOL(sock_create);
      
       int sock_create_kern(int family, int type, int protocol, struct socket **res)
       {
       	return __sock_create(&init_net, family, type, protocol, res, 1);
       }
      +EXPORT_SYMBOL(sock_create_kern);
      
       SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol)
       {
      @@ -1474,7 +1481,8 @@ SYSCALL_DEFINE4(accept4, int, fd, struct sockaddr __user *, upeer_sockaddr,
       		goto out;
      
       	err = -ENFILE;
      -	if (!(newsock = sock_alloc()))
      +	newsock = sock_alloc();
      +	if (!newsock)
       		goto out_put;
      
       	newsock->type = sock->type;
      @@ -1861,8 +1869,7 @@ SYSCALL_DEFINE3(sendmsg, int, fd, struct msghdr __user *, msg, unsigned, flags)
       	if (MSG_CMSG_COMPAT & flags) {
       		if (get_compat_msghdr(&msg_sys, msg_compat))
       			return -EFAULT;
      -	}
      -	else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
      +	} else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
       		return -EFAULT;
      
       	sock = sockfd_lookup_light(fd, &err, &fput_needed);
      @@ -1964,8 +1971,7 @@ static int __sys_recvmsg(struct socket *sock, struct msghdr __user *msg,
       	if (MSG_CMSG_COMPAT & flags) {
       		if (get_compat_msghdr(msg_sys, msg_compat))
       			return -EFAULT;
      -	}
      -	else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
      +	} else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
       		return -EFAULT;
      
       	err = -EMSGSIZE;
      @@ -2191,10 +2197,10 @@ SYSCALL_DEFINE5(recvmmsg, int, fd, struct mmsghdr __user *, mmsg,
       /* Argument list sizes for sys_socketcall */
       #define AL(x) ((x) * sizeof(unsigned long))
       static const unsigned char nargs[20] = {
      -	AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
      -	AL(3),AL(3),AL(4),AL(4),AL(4),AL(6),
      -	AL(6),AL(2),AL(5),AL(5),AL(3),AL(3),
      -	AL(4),AL(5)
      +	AL(0), AL(3), AL(3), AL(3), AL(2), AL(3),
      +	AL(3), AL(3), AL(4), AL(4), AL(4), AL(6),
      +	AL(6), AL(2), AL(5), AL(5), AL(3), AL(3),
      +	AL(4), AL(5)
       };
      
       #undef AL
      @@ -2340,6 +2346,7 @@ int sock_register(const struct net_proto_family *ops)
       	printk(KERN_INFO "NET: Registered protocol family %d\n", ops->family);
       	return err;
       }
      +EXPORT_SYMBOL(sock_register);
      
       /**
        *	sock_unregister - remove a protocol handler
      @@ -2366,6 +2373,7 @@ void sock_unregister(int family)
      
       	printk(KERN_INFO "NET: Unregistered protocol family %d\n", family);
       }
      +EXPORT_SYMBOL(sock_unregister);
      
       static int __init sock_init(void)
       {
      @@ -2490,13 +2498,13 @@ static int dev_ifconf(struct net *net, struct compat_ifconf __user *uifc32)
       		ifc.ifc_req = NULL;
       		uifc = compat_alloc_user_space(sizeof(struct ifconf));
       	} else {
      -		size_t len =((ifc32.ifc_len / sizeof (struct compat_ifreq)) + 1) *
      -			sizeof (struct ifreq);
      +		size_t len = ((ifc32.ifc_len / sizeof(struct compat_ifreq)) + 1) *
      +			sizeof(struct ifreq);
       		uifc = compat_alloc_user_space(sizeof(struct ifconf) + len);
       		ifc.ifc_len = len;
       		ifr = ifc.ifc_req = (void __user *)(uifc + 1);
       		ifr32 = compat_ptr(ifc32.ifcbuf);
      -		for (i = 0; i < ifc32.ifc_len; i += sizeof (struct compat_ifreq)) {
      +		for (i = 0; i < ifc32.ifc_len; i += sizeof(struct compat_ifreq)) {
       			if (copy_in_user(ifr, ifr32, sizeof(struct compat_ifreq)))
       				return -EFAULT;
       			ifr++;
      @@ -2516,9 +2524,9 @@ static int dev_ifconf(struct net *net, struct compat_ifconf __user *uifc32)
       	ifr = ifc.ifc_req;
       	ifr32 = compat_ptr(ifc32.ifcbuf);
       	for (i = 0, j = 0;
      -             i + sizeof (struct compat_ifreq) <= ifc32.ifc_len && j < ifc.ifc_len;
      -	     i += sizeof (struct compat_ifreq), j += sizeof (struct ifreq)) {
      -		if (copy_in_user(ifr32, ifr, sizeof (struct compat_ifreq)))
      +	     i + sizeof(struct compat_ifreq) <= ifc32.ifc_len && j < ifc.ifc_len;
      +	     i += sizeof(struct compat_ifreq), j += sizeof(struct ifreq)) {
      +		if (copy_in_user(ifr32, ifr, sizeof(struct compat_ifreq)))
       			return -EFAULT;
       		ifr32++;
       		ifr++;
      @@ -2567,7 +2575,7 @@ static int compat_siocwandev(struct net *net, struct compat_ifreq __user *uifr32
       	compat_uptr_t uptr32;
       	struct ifreq __user *uifr;
      
      -	uifr = compat_alloc_user_space(sizeof (*uifr));
      +	uifr = compat_alloc_user_space(sizeof(*uifr));
       	if (copy_in_user(uifr, uifr32, sizeof(struct compat_ifreq)))
       		return -EFAULT;
      
      @@ -2601,9 +2609,9 @@ static int bond_ioctl(struct net *net, unsigned int cmd,
       			return -EFAULT;
      
       		old_fs = get_fs();
      -		set_fs (KERNEL_DS);
      +		set_fs(KERNEL_DS);
       		err = dev_ioctl(net, cmd, &kifr);
      -		set_fs (old_fs);
      +		set_fs(old_fs);
      
       		return err;
       	case SIOCBONDSLAVEINFOQUERY:
      @@ -2710,9 +2718,9 @@ static int compat_sioc_ifmap(struct net *net, unsigned int cmd,
       		return -EFAULT;
      
       	old_fs = get_fs();
      -	set_fs (KERNEL_DS);
      +	set_fs(KERNEL_DS);
       	err = dev_ioctl(net, cmd, (void __user *)&ifr);
      -	set_fs (old_fs);
      +	set_fs(old_fs);
      
       	if (cmd == SIOCGIFMAP && !err) {
       		err = copy_to_user(uifr32, &ifr, sizeof(ifr.ifr_name));
      @@ -2734,7 +2742,7 @@ static int compat_siocshwtstamp(struct net *net, struct compat_ifreq __user *uif
       	compat_uptr_t uptr32;
       	struct ifreq __user *uifr;
      
      -	uifr = compat_alloc_user_space(sizeof (*uifr));
      +	uifr = compat_alloc_user_space(sizeof(*uifr));
       	if (copy_in_user(uifr, uifr32, sizeof(struct compat_ifreq)))
       		return -EFAULT;
      
      @@ -2750,20 +2758,20 @@ static int compat_siocshwtstamp(struct net *net, struct compat_ifreq __user *uif
       }
      
       struct rtentry32 {
      -	u32   		rt_pad1;
      +	u32		rt_pad1;
       	struct sockaddr rt_dst;         /* target address               */
       	struct sockaddr rt_gateway;     /* gateway addr (RTF_GATEWAY)   */
       	struct sockaddr rt_genmask;     /* target network mask (IP)     */
      -	unsigned short  rt_flags;
      -	short           rt_pad2;
      -	u32   		rt_pad3;
      -	unsigned char   rt_tos;
      -	unsigned char   rt_class;
      -	short           rt_pad4;
      -	short           rt_metric;      /* +1 for binary compatibility! */
      +	unsigned short	rt_flags;
      +	short		rt_pad2;
      +	u32		rt_pad3;
      +	unsigned char	rt_tos;
      +	unsigned char	rt_class;
      +	short		rt_pad4;
      +	short		rt_metric;      /* +1 for binary compatibility! */
       	/* char * */ u32 rt_dev;        /* forcing the device at add    */
      -	u32   		rt_mtu;         /* per route MTU/Window         */
      -	u32   		rt_window;      /* Window clamping              */
      +	u32		rt_mtu;         /* per route MTU/Window         */
      +	u32		rt_window;      /* Window clamping              */
       	unsigned short  rt_irtt;        /* Initial RTT                  */
       };
      
      @@ -2793,29 +2801,29 @@ static int routing_ioctl(struct net *net, struct socket *sock,
      
       	if (sock && sock->sk && sock->sk->sk_family == AF_INET6) { /* ipv6 */
       		struct in6_rtmsg32 __user *ur6 = argp;
      -		ret = copy_from_user (&r6.rtmsg_dst, &(ur6->rtmsg_dst),
      +		ret = copy_from_user(&r6.rtmsg_dst, &(ur6->rtmsg_dst),
       			3 * sizeof(struct in6_addr));
      -		ret |= __get_user (r6.rtmsg_type, &(ur6->rtmsg_type));
      -		ret |= __get_user (r6.rtmsg_dst_len, &(ur6->rtmsg_dst_len));
      -		ret |= __get_user (r6.rtmsg_src_len, &(ur6->rtmsg_src_len));
      -		ret |= __get_user (r6.rtmsg_metric, &(ur6->rtmsg_metric));
      -		ret |= __get_user (r6.rtmsg_info, &(ur6->rtmsg_info));
      -		ret |= __get_user (r6.rtmsg_flags, &(ur6->rtmsg_flags));
      -		ret |= __get_user (r6.rtmsg_ifindex, &(ur6->rtmsg_ifindex));
      +		ret |= __get_user(r6.rtmsg_type, &(ur6->rtmsg_type));
      +		ret |= __get_user(r6.rtmsg_dst_len, &(ur6->rtmsg_dst_len));
      +		ret |= __get_user(r6.rtmsg_src_len, &(ur6->rtmsg_src_len));
      +		ret |= __get_user(r6.rtmsg_metric, &(ur6->rtmsg_metric));
      +		ret |= __get_user(r6.rtmsg_info, &(ur6->rtmsg_info));
      +		ret |= __get_user(r6.rtmsg_flags, &(ur6->rtmsg_flags));
      +		ret |= __get_user(r6.rtmsg_ifindex, &(ur6->rtmsg_ifindex));
      
       		r = (void *) &r6;
       	} else { /* ipv4 */
       		struct rtentry32 __user *ur4 = argp;
      -		ret = copy_from_user (&r4.rt_dst, &(ur4->rt_dst),
      +		ret = copy_from_user(&r4.rt_dst, &(ur4->rt_dst),
       					3 * sizeof(struct sockaddr));
      -		ret |= __get_user (r4.rt_flags, &(ur4->rt_flags));
      -		ret |= __get_user (r4.rt_metric, &(ur4->rt_metric));
      -		ret |= __get_user (r4.rt_mtu, &(ur4->rt_mtu));
      -		ret |= __get_user (r4.rt_window, &(ur4->rt_window));
      -		ret |= __get_user (r4.rt_irtt, &(ur4->rt_irtt));
      -		ret |= __get_user (rtdev, &(ur4->rt_dev));
      +		ret |= __get_user(r4.rt_flags, &(ur4->rt_flags));
      +		ret |= __get_user(r4.rt_metric, &(ur4->rt_metric));
      +		ret |= __get_user(r4.rt_mtu, &(ur4->rt_mtu));
      +		ret |= __get_user(r4.rt_window, &(ur4->rt_window));
      +		ret |= __get_user(r4.rt_irtt, &(ur4->rt_irtt));
      +		ret |= __get_user(rtdev, &(ur4->rt_dev));
       		if (rtdev) {
      -			ret |= copy_from_user (devname, compat_ptr(rtdev), 15);
      +			ret |= copy_from_user(devname, compat_ptr(rtdev), 15);
       			r4.rt_dev = devname; devname[15] = 0;
       		} else
       			r4.rt_dev = NULL;
      @@ -2828,9 +2836,9 @@ static int routing_ioctl(struct net *net, struct socket *sock,
       		goto out;
       	}
      
      -	set_fs (KERNEL_DS);
      +	set_fs(KERNEL_DS);
       	ret = sock_do_ioctl(net, sock, cmd, (unsigned long) r);
      -	set_fs (old_fs);
      +	set_fs(old_fs);
      
       out:
       	return ret;
      @@ -2993,11 +3001,13 @@ int kernel_bind(struct socket *sock, struct sockaddr *addr, int addrlen)
       {
       	return sock->ops->bind(sock, addr, addrlen);
       }
      +EXPORT_SYMBOL(kernel_bind);
      
       int kernel_listen(struct socket *sock, int backlog)
       {
       	return sock->ops->listen(sock, backlog);
       }
      +EXPORT_SYMBOL(kernel_listen);
      
       int kernel_accept(struct socket *sock, struct socket **newsock, int flags)
       {
      @@ -3022,24 +3032,28 @@ int kernel_accept(struct socket *sock, struct socket **newsock, int flags)
       done:
       	return err;
       }
      +EXPORT_SYMBOL(kernel_accept);
      
       int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
       		   int flags)
       {
       	return sock->ops->connect(sock, addr, addrlen, flags);
       }
      +EXPORT_SYMBOL(kernel_connect);
      
       int kernel_getsockname(struct socket *sock, struct sockaddr *addr,
       			 int *addrlen)
       {
       	return sock->ops->getname(sock, addr, addrlen, 0);
       }
      +EXPORT_SYMBOL(kernel_getsockname);
      
       int kernel_getpeername(struct socket *sock, struct sockaddr *addr,
       			 int *addrlen)
       {
       	return sock->ops->getname(sock, addr, addrlen, 1);
       }
      +EXPORT_SYMBOL(kernel_getpeername);
      
       int kernel_getsockopt(struct socket *sock, int level, int optname,
       			char *optval, int *optlen)
      @@ -3056,6 +3070,7 @@ int kernel_getsockopt(struct socket *sock, int level, int optname,
       	set_fs(oldfs);
       	return err;
       }
      +EXPORT_SYMBOL(kernel_getsockopt);
      
       int kernel_setsockopt(struct socket *sock, int level, int optname,
       			char *optval, unsigned int optlen)
      @@ -3072,6 +3087,7 @@ int kernel_setsockopt(struct socket *sock, int level, int optname,
       	set_fs(oldfs);
       	return err;
       }
      +EXPORT_SYMBOL(kernel_setsockopt);
      
       int kernel_sendpage(struct socket *sock, struct page *page, int offset,
       		    size_t size, int flags)
      @@ -3083,6 +3099,7 @@ int kernel_sendpage(struct socket *sock, struct page *page, int offset,
      
       	return sock_no_sendpage(sock, page, offset, size, flags);
       }
      +EXPORT_SYMBOL(kernel_sendpage);
      
       int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg)
       {
      @@ -3095,33 +3112,11 @@ int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg)
      
       	return err;
       }
      +EXPORT_SYMBOL(kernel_sock_ioctl);
      
       int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
       {
       	return sock->ops->shutdown(sock, how);
       }
      -
      -EXPORT_SYMBOL(sock_create);
      -EXPORT_SYMBOL(sock_create_kern);
      -EXPORT_SYMBOL(sock_create_lite);
      -EXPORT_SYMBOL(sock_map_fd);
      -EXPORT_SYMBOL(sock_recvmsg);
      -EXPORT_SYMBOL(sock_register);
      -EXPORT_SYMBOL(sock_release);
      -EXPORT_SYMBOL(sock_sendmsg);
      -EXPORT_SYMBOL(sock_unregister);
      -EXPORT_SYMBOL(sock_wake_async);
      -EXPORT_SYMBOL(sockfd_lookup);
      -EXPORT_SYMBOL(kernel_sendmsg);
      -EXPORT_SYMBOL(kernel_recvmsg);
      -EXPORT_SYMBOL(kernel_bind);
      -EXPORT_SYMBOL(kernel_listen);
      -EXPORT_SYMBOL(kernel_accept);
      -EXPORT_SYMBOL(kernel_connect);
      -EXPORT_SYMBOL(kernel_getsockname);
      -EXPORT_SYMBOL(kernel_getpeername);
      -EXPORT_SYMBOL(kernel_getsockopt);
      -EXPORT_SYMBOL(kernel_setsockopt);
      -EXPORT_SYMBOL(kernel_sendpage);
      -EXPORT_SYMBOL(kernel_sock_ioctl);
       EXPORT_SYMBOL(kernel_sock_shutdown);
      +
      --
      1.7.0.4
      c6d409cf
  28. 24 May, 2010 1 commit
    • H
      cls_cgroup: Store classid in struct sock · f8451725
      Herbert Xu committed
      Up until now cls_cgroup has relied on fetching the classid out of
      the currently executing thread.  This runs into trouble when packet
      processing is delayed, in which case it may execute in another
      thread's context.
      
      Furthermore, even when a packet is not delayed we may fail to
      classify it if soft IRQs have been disabled, because this scenario
      is indistinguishable from one where a packet unrelated to the
      current thread is processed by a real soft IRQ.
      
      In fact, the current semantics are inherently broken, as a single
      skb may be constructed out of the writes of two different tasks.
      A different manifestation of this problem is when the TCP stack
      transmits in response to an incoming ACK.  This is currently
      unclassified.
      
      As we already have a concept of packet ownership for accounting
      purposes in the skb->sk pointer, this is a natural place to store
      the classid in a persistent manner.
      
      This patch adds the cls_cgroup classid in struct sock, filling up
      an existing hole on 64-bit :)
      
      The value is set at socket creation time.  So all sockets created
      via socket(2) automatically gain the ID of the thread creating them.
      Whenever another process touches the socket by either reading or
      writing to it, we will change the socket classid to that of the
      process if it has a valid (non-zero) classid.
      
      For sockets created on inbound connections through accept(2), we
      inherit the classid of the original listening socket through
      sk_clone, possibly preceding the actual accept(2) call.
      
      In order to minimise risks, I have not made this the authoritative
      classid.  For now it is only used as a backup when we execute
      with soft IRQs disabled.  Once we're completely happy with its
      semantics we can use it as the sole classid.
      
      Footnote: I have rearranged the error path on cls_cgroup module
      creation.  If we didn't do this, then there is a window where
      someone could create a tc rule using cls_cgroup before the cgroup
      subsystem has been registered.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f8451725
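      The classid rules described in the commit message above (set at
      creation, updated by any task with a non-zero classid, inherited on
      accept) can be modelled in a few lines of userspace C. This is only a
      sketch: `task_model`, `sock_model` and every function name below are
      hypothetical stand-ins, not the kernel's structures or helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; not the kernel's real structures. */
struct task_model { uint32_t classid; };     /* 0 means "no valid classid" */
struct sock_model { uint32_t sk_classid; };

/* socket(2): the new socket takes the classid of the creating thread. */
static void sock_init_classid(struct sock_model *sk,
			      const struct task_model *creator)
{
	sk->sk_classid = creator->classid;
}

/* read/write: adopt the caller's classid only if it is valid (non-zero). */
static void sock_touch_classid(struct sock_model *sk,
			       const struct task_model *caller)
{
	if (caller->classid)
		sk->sk_classid = caller->classid;
}

/* accept(2): the child socket inherits from the listening socket. */
static void sock_clone_classid(struct sock_model *child,
			       const struct sock_model *listener)
{
	child->sk_classid = listener->sk_classid;
}

/* Walk through the three rules and return the child's final classid. */
static uint32_t classid_demo(void)
{
	struct task_model creator = { .classid = 0x10001 };
	struct task_model unclassified = { .classid = 0 };
	struct task_model other = { .classid = 0x10002 };
	struct sock_model listener, child;

	sock_init_classid(&listener, &creator);
	sock_touch_classid(&listener, &unclassified);
	assert(listener.sk_classid == 0x10001);  /* zero classid is ignored */

	sock_clone_classid(&child, &listener);
	assert(child.sk_classid == 0x10001);     /* inherited on accept */

	sock_touch_classid(&child, &other);      /* valid classid takes over */
	return child.sk_classid;
}
```

      Calling `classid_demo()` from a `main()` exercises all three rules;
      the model deliberately ignores locking and the soft-IRQ fallback
      mentioned in the message.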
  29. 18 May, 2010 1 commit
  30. 02 May, 2010 1 commit
    • E
      net: sock_def_readable() and friends RCU conversion · 43815482
      Eric Dumazet committed
      The sk_callback_lock rwlock actually protects the sk->sk_sleep
      pointer, so we need two atomic operations (and associated dirtying)
      per incoming packet.
      
      RCU conversion is pretty much needed:
      
      1) Add a new structure, called "struct socket_wq" to hold all fields
      that will need rcu_read_lock() protection (currently: a
      wait_queue_head_t and a struct fasync_struct pointer).
      
      [Future patch will add a list anchor for wakeup coalescing]
      
      2) Attach one such structure to each "struct socket" created in
      sock_alloc_inode().
      
      3) Respect the RCU grace period when freeing a "struct socket_wq".
      
      4) Replace the sk_sleep pointer in "struct sock" with sk_wq, a pointer
      to "struct socket_wq".
      
      5) Change sk_sleep() function to use new sk->sk_wq instead of
      sk->sk_sleep
      
      6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
      a rcu_read_lock() section.
      
      7) Change all sk_has_sleeper() callers to:
        - Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock)
        - Use wq_has_sleeper() to wake up tasks if there are any.
        - Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock)
      
      8) sock_wake_async() is modified to use RCU protection as well.
      
      9) Exceptions:
        macvtap, drivers/net/tun.c and af_unix use an integrated "struct
      socket_wq" instead of dynamically allocated ones. They don't need RCU
      freeing.
      
      Some cleanups or followups are probably needed (a possible
      sk_callback_lock conversion to a spinlock, for example).
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      43815482
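      The reader-side pattern from step 7 above (load the sk_wq pointer,
      test for sleepers, then leave the read section) can be approximated
      in userspace with C11 atomics. This is a rough sketch under heavy
      assumptions: an acquire load stands in for rcu_read_lock() plus
      rcu_dereference(), there is no grace-period handling at all, and all
      `*_model` names and `sock_would_wake()` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "struct socket_wq": just a sleeper count. */
struct socket_wq_model { int nr_waiters; };

/* Hypothetical stand-in for "struct sock" holding the sk_wq pointer. */
struct sock_model { _Atomic(struct socket_wq_model *) sk_wq; };

/* Analogue of wq_has_sleeper(): true only if a queue exists and is busy. */
static bool wq_has_sleeper_model(const struct socket_wq_model *wq)
{
	return wq != NULL && wq->nr_waiters > 0;
}

/* Writer side: publish a wait queue so that readers can see it. */
static void sock_publish_wq(struct sock_model *sk, struct socket_wq_model *wq)
{
	atomic_store_explicit(&sk->sk_wq, wq, memory_order_release);
}

/* Reader side: no lock is taken, only an ordered pointer load. */
static bool sock_would_wake(struct sock_model *sk)
{
	/* stands in for rcu_read_lock() + rcu_dereference(sk->sk_wq) */
	struct socket_wq_model *wq =
		atomic_load_explicit(&sk->sk_wq, memory_order_acquire);
	bool wake = wq_has_sleeper_model(wq);
	/* rcu_read_unlock() would go here */
	return wake;
}

/* Exercise the three cases: no queue, empty queue, queue with a sleeper. */
static bool wq_demo(void)
{
	struct socket_wq_model wq = { .nr_waiters = 0 };
	struct sock_model sk;

	atomic_init(&sk.sk_wq, NULL);
	assert(!sock_would_wake(&sk));   /* no wait queue attached */

	sock_publish_wq(&sk, &wq);
	assert(!sock_would_wake(&sk));   /* queue attached, nobody sleeping */

	wq.nr_waiters = 1;
	return sock_would_wake(&sk);     /* a sleeper is now present */
}
```

      The point of the model is that the per-packet path only reads a
      pointer instead of taking and releasing a rwlock; deferring the free
      of an old queue until readers are done (step 3) is exactly the part
      this sketch leaves out.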
  31. 01 May, 2010 1 commit