1. 11 10月, 2007 4 次提交
    • T
      [NET]: sanitize kernel_accept() error path · fa8705b0
      Tony Battersby 提交于
      If kernel_accept() returns an error, it may pass back a pointer to
      freed memory (which the caller should ignore).  Make it pass back NULL
      instead for better safety.
      Signed-off-by: NTony Battersby <tonyb@cybernetics.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fa8705b0
    • S
      [NET]: sparse warning fixes · cfcabdcc
      Stephen Hemminger 提交于
      Fix a bunch of sparse warnings. Mostly about 0 used as
      NULL pointer, and shadowed variable declarations.
      One notable case was that hash size should have been unsigned.
      Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cfcabdcc
    • E
      [NET]: Make the device list and device lookups per namespace. · 881d966b
      Eric W. Biederman 提交于
      This patch makes most of the generic device layer network
      namespace safe.  This patch makes dev_base_head a
      network namespace variable, and then it picks up
      a few associated variables.  The functions:
      dev_getbyhwaddr
      dev_getfirsthwbytype
      dev_get_by_flags
      dev_get_by_name
      __dev_get_by_name
      dev_get_by_index
      __dev_get_by_index
      dev_ioctl
      dev_ethtool
      dev_load
      wireless_process_ioctl
      
      were modified to take a network namespace argument, and
      deal with it.
      
      vlan_ioctl_set and brioctl_set were modified so their
      hooks will receive a network namespace argument.
      
      So basically anthing in the core of the network stack that was
      affected to by the change of dev_base was modified to handle
      multiple network namespaces.  The rest of the network stack was
      simply modified to explicitly use &init_net the initial network
      namespace.  This can be fixed when those components of the network
      stack are modified to handle multiple network namespaces.
      
      For now the ifindex generator is left global.
      
      Fundametally ifindex numbers are per namespace, or else
      we will have corner case problems with migration when
      we get that far.
      
      At the same time there are assumptions in the network stack
      that the ifindex of a network device won't change.  Making
      the ifindex number global seems a good compromise until
      the network stack can cope with ifindex changes when
      you change namespaces, and the like.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      881d966b
    • E
      [NET]: Make socket creation namespace safe. · 1b8d7ae4
      Eric W. Biederman 提交于
      This patch passes in the namespace a new socket should be created in
      and has the socket code do the appropriate reference counting.  By
      virtue of this all socket create methods are touched.  In addition
      the socket create methods are modified so that they will fail if
      you attempt to create a socket in a non-default network namespace.
      
      Failing if we attempt to create a socket outside of the default
      network namespace ensures that as we incrementally make the network stack
      network namespace aware we will not export functionality that someone
      has not audited and made certain is network namespace safe.
      Allowing us to partially enable network namespaces before all of the
      exotic protocols are supported.
      
      Any protocol layers I have missed will fail to compile because I now
      pass an extra parameter into the socket creation code.
      
      [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1b8d7ae4
  2. 28 9月, 2007 1 次提交
    • D
      [NET]: Zero length write() on socket should not simply return 0. · e79ad711
      David S. Miller 提交于
      This fixes kernel bugzilla #5731
      
      It should generate an empty packet for datagram protocols when the
      socket is connected, for one.
      
      The check is doubly-wrong because all that a write() can be is a
      sendmsg() call with a NULL msg_control and a single entry iovec.  No
      special semantics should be assigned to it, therefore the zero length
      check should be removed entirely.
      
      This matches the behavior of BSD and several other systems.
      
      Alan Cox notes that SuSv3 says the behavior of a zero length write on
      non-files is "unspecified", but that's kind of useless since BSD has
      defined this behavior for a quarter century and BSD is essentially
      what application folks code to.
      
      Based upon a patch from Stephen Hemminger.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e79ad711
  3. 16 8月, 2007 1 次提交
  4. 20 7月, 2007 1 次提交
    • P
      mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt 提交于
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      20c2df83
  5. 17 7月, 2007 1 次提交
    • U
      O_CLOEXEC for SCM_RIGHTS · 4a19542e
      Ulrich Drepper 提交于
      Part two in the O_CLOEXEC saga: adding support for file descriptors received
      through Unix domain sockets.
      
      The patch is once again pretty minimal, it introduces a new flag for recvmsg
      and passes it just like the existing MSG_CMSG_COMPAT flag.  I think this bit
      is not used otherwise but the networking people will know better.
      
      This new flag is not recognized by recvfrom and recv.  These functions cannot
      be used for that purpose and the asymmetry this introduces is not worse than
      the already existing MSG_CMSG_COMPAT situations.
      
      The patch must be applied on the patch which introduced O_CLOEXEC.  It has to
      remove static from the new get_unused_fd_flags function but since scm.c cannot
      live in a module the function still hasn't to be exported.
      
      Here's a test program to make sure the code works.  It's so much longer than
      the actual patch...
      
      #include <errno.h>
      #include <error.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/socket.h>
      #include <sys/un.h>
      
      #ifndef O_CLOEXEC
      # define O_CLOEXEC 02000000
      #endif
      #ifndef MSG_CMSG_CLOEXEC
      # define MSG_CMSG_CLOEXEC 0x40000000
      #endif
      
      int
      main (int argc, char *argv[])
      {
        if (argc > 1)
          {
            int fd = atol (argv[1]);
            printf ("child: fd = %d\n", fd);
            if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
              {
                puts ("file descriptor valid in child");
                return 1;
              }
            return 0;
      
          }
      
        struct sockaddr_un sun;
        strcpy (sun.sun_path, "./testsocket");
        sun.sun_family = AF_UNIX;
      
        char databuf[] = "hello";
        struct iovec iov[1];
        iov[0].iov_base = databuf;
        iov[0].iov_len = sizeof (databuf);
      
        union
        {
          struct cmsghdr hdr;
          char bytes[CMSG_SPACE (sizeof (int))];
        } buf;
        struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1,
                              .msg_control = buf.bytes,
                              .msg_controllen = sizeof (buf) };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);
      
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN (sizeof (int));
      
        msg.msg_controllen = cmsg->cmsg_len;
      
        pid_t child = fork ();
        if (child == -1)
          error (1, errno, "fork");
        if (child == 0)
          {
            int sock = socket (PF_UNIX, SOCK_STREAM, 0);
            if (sock < 0)
              error (1, errno, "socket");
      
            if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
              error (1, errno, "bind");
            if (listen (sock, SOMAXCONN) < 0)
              error (1, errno, "listen");
      
            int conn = accept (sock, NULL, NULL);
            if (conn == -1)
              error (1, errno, "accept");
      
            *(int *) CMSG_DATA (cmsg) = sock;
            if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0)
              error (1, errno, "sendmsg");
      
            return 0;
          }
      
        /* For a test suite this should be more robust like a
           barrier in shared memory.  */
        sleep (1);
      
        int sock = socket (PF_UNIX, SOCK_STREAM, 0);
        if (sock < 0)
          error (1, errno, "socket");
      
        if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
          error (1, errno, "connect");
        unlink (sun.sun_path);
      
        *(int *) CMSG_DATA (cmsg) = -1;
      
        if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0)
          error (1, errno, "recvmsg");
      
        int fd = *(int *) CMSG_DATA (cmsg);
        if (fd == -1)
          error (1, 0, "no descriptor received");
      
        char fdname[20];
        snprintf (fdname, sizeof (fdname), "%d", fd);
        execl ("/proc/self/exe", argv[0], fdname, NULL);
        puts ("execl failed");
        return 1;
      }
      
      [akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Michael Buesch <mb@bu3sch.de>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4a19542e
  6. 17 5月, 2007 1 次提交
    • C
      Remove SLAB_CTOR_CONSTRUCTOR · a35afb83
      Christoph Lameter 提交于
      SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a35afb83
  7. 09 5月, 2007 1 次提交
    • E
      VFS: delay the dentry name generation on sockets and pipes · c23fbb6b
      Eric Dumazet 提交于
      1) Introduces a new method in 'struct dentry_operations'.  This method
         called d_dname() might be called from d_path() to build a pathname for
         special filesystems.  It is called without locks.
      
         Future patches (if we succeed in having one common dentry for all
         pipes/sockets) may need to change prototype of this method, but we now
         use : char *d_dname(struct dentry *dentry, char *buffer, int buflen);
      
      2) Adds a dynamic_dname() helper function that eases d_dname() implementations
      
      3) Defines d_dname method for sockets : No more sprintf() at socket
         creation.  This is delayed up to the moment someone does an access to
         /proc/pid/fd/...
      
      4) Defines d_dname method for pipes : No more sprintf() at pipe
         creation.  This is delayed up to the moment someone does an access to
         /proc/pid/fd/...
      
      A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a
      *nice* speedup on my Pentium(M) 1.6 Ghz :
      
      3.090 s instead of 3.450 s
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Acked-by: NChristoph Hellwig <hch@infradead.org>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c23fbb6b
  8. 08 5月, 2007 1 次提交
    • C
      slab allocators: Remove SLAB_DEBUG_INITIAL flag · 50953fe9
      Christoph Lameter 提交于
      I have never seen a use of SLAB_DEBUG_INITIAL.  It is only supported by
      SLAB.
      
      I think its purpose was to have a callback after an object has been freed
      to verify that the state is the constructor state again?  The callback is
      performed before each freeing of an object.
      
      I would think that it is much easier to check the object state manually
      before the free.  That also places the check near the code object
      manipulation of the object.
      
      Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
      compiled with SLAB debugging on.  If there would be code in a constructor
      handling SLAB_DEBUG_INITIAL then it would have to be conditional on
      SLAB_DEBUG otherwise it would just be dead code.  But there is no such code
      in the kernel.  I think SLUB_DEBUG_INITIAL is too problematic to make real
      use of, difficult to understand and there are easier ways to accomplish the
      same effect (i.e.  add debug code before kfree).
      
      There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
      clear in fs inode caches.  Remove the pointless checks (they would even be
      pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.
      
      This is the last slab flag that SLUB did not support.  Remove the check for
      unimplemented flags from SLUB.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      50953fe9
  9. 26 4月, 2007 3 次提交
  10. 27 3月, 2007 1 次提交
  11. 18 2月, 2007 1 次提交
    • A
      [PATCH] AUDIT_FD_PAIR · db349509
      Al Viro 提交于
      Provide an audit record of the descriptor pair returned by pipe() and
      socketpair().  Rewritten from the original posted to linux-audit by
      John D. Ramsdell <ramsdell@mitre.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      db349509
  12. 13 2月, 2007 1 次提交
  13. 11 2月, 2007 1 次提交
  14. 09 2月, 2007 2 次提交
    • D
      [NET]: Fix net/socket.c warnings. · 4387ff75
      David S. Miller 提交于
      GCC (correctly) says:
      
      net/socket.c: In function ‘sys_sendto’:
      net/socket.c:1510: warning: ‘err’ may be used uninitialized in this function
      net/socket.c: In function ‘sys_recvfrom’:
      net/socket.c:1571: warning: ‘err’ may be used uninitialized in this function
      
      sock_from_file() either returns filp->private_data or it
      sets *err and returns NULL.
      
      Callers return "err" on NULL, but filp->private_data could
      be NULL.
      
      Some minor rearrangements of error handling in sys_sendto
      and sys_recvfrom solves the issue.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4387ff75
    • E
      [NET]: cleanup sock_from_file() · 23bb80d2
      Eric Dumazet 提交于
      I believe dead code from sock_from_file() can be cleaned up.
      
      All sockets are now built using sock_attach_fd(), that puts the 'sock' pointer 
      into file->private_data and &socket_file_ops into file->f_op
      
      I could not find a place where file->private_data could be set to NULL, 
      keeping opened the file.
      
      So to get 'sock' from a 'file' pointer, either :
      
      - This is a socket file (f_op == &socket_file_ops), and we can directly get 
      'sock' from private_data.
      - This is not a socket, we return -ENOTSOCK and dont even try to find a socket 
      via dentry/inode :)
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      23bb80d2
  15. 09 12月, 2006 1 次提交
  16. 08 12月, 2006 3 次提交
  17. 03 12月, 2006 1 次提交
  18. 02 10月, 2006 1 次提交
  19. 01 10月, 2006 2 次提交
  20. 23 9月, 2006 7 次提交
  21. 01 9月, 2006 1 次提交
  22. 01 7月, 2006 1 次提交
  23. 23 6月, 2006 1 次提交
    • D
      [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells 提交于
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience function is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing are currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      454e2398
  24. 01 5月, 2006 1 次提交
    • S
      [PATCH] sockaddr patch · d6fe3945
      Steve Grubb 提交于
      On Thursday 23 March 2006 09:08, John D. Ramsdell wrote:
      >  I noticed that a socketcall(bind) and socketcall(connect) event contain a
      >  record of type=SOCKADDR, but I cannot see one for a system call event
      >  associated with socketcall(accept).  Recording the sockaddr of an accepted
      >  socket is important for cross platform information flow analys
      
      Thanks for pointing this out. The following patch should address this.
      Signed-off-by: NSteve Grubb <sgrubb@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d6fe3945
  25. 20 4月, 2006 1 次提交