1. 11 10月, 2007 6 次提交
    • J
      [ETHTOOL]: Add ETHTOOL_[GS]FLAGS sub-ioctls · 3ae7c0b2
      Jeff Garzik 提交于
      Signed-off-by: NJeff Garzik <jeff@garzik.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3ae7c0b2
    • S
      [NET] netconsole: Support dynamic reconfiguration using configfs · 0bcc1816
      Satyam Sharma 提交于
      Based upon initial work by Keiichi Kii <k-keiichi@bx.jp.nec.com>.
      
      This patch introduces support for dynamic reconfiguration (adding, removing
      and/or modifying parameters of netconsole targets at runtime) using a
      userspace interface exported via configfs.  Documentation is also updated
      accordingly.
      
      Issues and brief design overview:
      
      (1) Kernel-initiated creation / destruction of kernel objects is not
          possible with configfs -- the lifetimes of the "config items" is managed
          exclusively from userspace.  But netconsole must support boot/module
          params too, and these are parsed in kernel and hence netpolls must be
          setup from the kernel.  Joel Becker suggested to separately manage the
          lifetimes of the two kinds of netconsole_target objects -- those created
          via configfs mkdir(2) from userspace and those specified from the
          boot/module option string.  This adds complexity and some redundancy here
          and also means that boot/module param-created targets are not exposed
          through the configfs namespace (and hence cannot be updated / destroyed
          dynamically).  However, this saves us from locking / refcounting
          complexities that would need to be introduced in configfs to support
          kernel-initiated item creation / destroy there.
      
      (2) In configfs, item creation takes place in the call chain of the
          mkdir(2) syscall in the driver subsystem.  If we used an ioctl(2) to
          create / destroy objects from userspace, the special userspace program is
          able to fill out the structure to be passed into the ioctl and hence
          specify attributes such as local interface that are required at the time
          we set up the netpoll.  For configfs, this information is not available at
          the time of mkdir(2).  So, we keep all newly-created targets (via
          configfs) disabled by default.  The user is expected to set various
          attributes appropriately (including the local network interface if
          required) and then write(2) "1" to the "enabled" attribute.  Thus,
          netpoll_setup() is then called on the set parameters in the context of
          _this_ write(2) on the "enabled" attribute itself.  This design enables
          the user to reconfigure existing netconsole targets at runtime to be
          attached to newly-come-up interfaces that may not have existed when
          netconsole was loaded or when the targets were actually created.  All this
          effectively enables us to get rid of custom ioctls.
      
      (3) Ultra-paranoid configfs attribute show() and store() operations, with
          sanity and input range checking, using only safe string primitives, and
          compliant with the recommendations in Documentation/filesystems/sysfs.txt.
      
      (4) A new function netpoll_print_options() is created in the netpoll API,
          that just prints out the configured parameters for a netpoll structure.
          netpoll_parse_options() is modified to use that and it is also exported to
          be used from netconsole.
      Signed-off-by: NSatyam Sharma <satyam@infradead.org>
      Acked-by: NKeiichi Kii <k-keiichi@bx.jp.nec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0bcc1816
    • T
      [NEIGH]: Netlink notifications · d961db35
      Thomas Graf 提交于
      Currently neighbour event notifications are limited to update
      notifications and only sent if the ARP daemon is enabled. This
      patch extends the existing notification code by also reporting
      neighbours being removed due to gc or administratively and
      removes the dependency on the ARP daemon. This allows to keep
      track of neighbour states without periodically fetching the
      complete neighbour table.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d961db35
    • T
      [NEIGH]: Combine neighbour cleanup and release · 4f494554
      Thomas Graf 提交于
      Introduces neigh_cleanup_and_release() to be used after a
      neighbour has been removed from its neighbour table. Serves
      as preparation to add event notifications.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f494554
    • P
      [RTNETLINK]: Introduce generic rtnl_create_link(). · e7199288
      Pavel Emelianov 提交于
      This routine gets the parsed rtnl attributes and creates a new
      link with generic info (IFLA_LINKINFO policy). Its intention
      is to help the drivers, that need to create several links at
      once (like VETH).
      
      This is nothing but a copy-paste-ed part of rtnl_newlink() function
      that is responsible for creation of new device.
      Signed-off-by: NPavel Emelianov <xemul@openvz.org>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e7199288
    • S
      [NET]: Make NAPI polling independent of struct net_device objects. · bea3348e
      Stephen Hemminger 提交于
      Several devices have multiple independant RX queues per net
      device, and some have a single interrupt doorbell for several
      queues.
      
      In either case, it's easier to support layouts like that if the
      structure representing the poll is independant from the net
      device itself.
      
      The signature of the ->poll() call back goes from:
      
      	int foo_poll(struct net_device *dev, int *budget)
      
      to
      
      	int foo_poll(struct napi_struct *napi, int budget)
      
      The caller is returned the number of RX packets processed (or
      the number of "NAPI credits" consumed if you want to get
      abstract).  The callee no longer messes around bumping
      dev->quota, *budget, etc. because that is all handled in the
      caller upon return.
      
      The napi_struct is to be embedded in the device driver private data
      structures.
      
      Furthermore, it is the driver's responsibility to disable all NAPI
      instances in it's ->stop() device close handler.  Since the
      napi_struct is privatized into the driver's private data structures,
      only the driver knows how to get at all of the napi_struct instances
      it may have per-device.
      
      With lots of help and suggestions from Rusty Russell, Roland Dreier,
      Michael Chan, Jeff Garzik, and Jamal Hadi Salim.
      
      Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
      Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.
      
      [ Ported to current tree and all drivers converted.  Integrated
        Stephen's follow-on kerneldoc additions, and restored poll_list
        handling to the old style to fix mutual exclusion issues.  -DaveM ]
      Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bea3348e
  2. 17 9月, 2007 1 次提交
  3. 15 9月, 2007 1 次提交
    • D
      [NET]: Fix two issues wrt. SO_BINDTODEVICE. · 4878809f
      David S. Miller 提交于
      1) Comments suggest that setting optlen to zero will unbind
         the socket from whatever device it might be attached to.  This
         hasn't been the case since at least 2.2.x because the first thing
         this function does is return -EINVAL if 'optlen' is less than
         sizeof(int).
      
         This check also means that passing in a two byte string doesn't
         work so well.  It's almost as if this code was testing with "eth?"
         patterned strings and nothing else :-)
      
         Fix this by breaking the logic of this facility out into a
         seperate function which validates optlen more appropriately.
      
         The optlen==0 and small string cases now work properly.
      
      2) We should reset the cached route of the socket after we have made
         the device binding changes, not before.
      
      Reported by Ben Greear.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4878809f
  4. 11 9月, 2007 1 次提交
  5. 31 8月, 2007 1 次提交
  6. 29 8月, 2007 1 次提交
  7. 27 8月, 2007 2 次提交
  8. 15 8月, 2007 1 次提交
  9. 14 8月, 2007 1 次提交
  10. 08 8月, 2007 1 次提交
  11. 01 8月, 2007 3 次提交
  12. 31 7月, 2007 6 次提交
  13. 22 7月, 2007 1 次提交
  14. 21 7月, 2007 1 次提交
  15. 20 7月, 2007 2 次提交
  16. 19 7月, 2007 1 次提交
  17. 18 7月, 2007 5 次提交
    • D
    • D
      [NET]: merge dev_unicast_discard and dev_mc_discard into one · 26cc2522
      Denis Cheng 提交于
      this two functions could share the dev->_xmit_lock acquired context.
      Signed-off-by: NDenis Cheng <crquan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      26cc2522
    • D
      [NET]: move dev_mc_discard from dev_mcast.c to dev.c · 456ad75c
      Denis Cheng 提交于
      Because this function is only called by unregister_netdevice,
      this moving could make this non-global function static,
      and also remove its declaration in netdevice.h;
      
      Any further, function __dev_addr_discard is also just called by
      dev_mc_discard and dev_unicast_discard, keeping this two functions
      both in one c file could make __dev_addr_discard also static
      and remove its declaration in netdevice.h;
      
      Futhermore, the sequential call to dev_unicast_discard and then
      dev_mc_discard in unregister_netdevice have a similar mechanism that:
      (netif_tx_lock_bh / __dev_addr_discard / netif_tx_unlock_bh),
      they should merged into one to eliminate duplicates in acquiring and
      releasing the dev->_xmit_lock, this would be done in my following patch.
      Signed-off-by: NDenis Cheng <crquan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      456ad75c
    • R
      [NET]: gen_estimator deadlock fix · 0929c2dd
      Ranko Zivojnovic 提交于
      -Fixes ABBA deadlock noted by Patrick McHardy <kaber@trash.net>:
      
      > There is at least one ABBA deadlock, est_timer() does:
      > read_lock(&est_lock)
      > spin_lock(e->stats_lock) (which is dev->queue_lock)
      >
      > and qdisc_destroy calls htb_destroy under dev->queue_lock, which
      > calls htb_destroy_class, then gen_kill_estimator and this
      > write_locks est_lock.
      
      To fix the ABBA deadlock the rate estimators are now kept on an rcu list.
      
      -The est_lock changes the use from protecting the list to protecting
      the update to the 'bstat' pointer in order to avoid NULL dereferencing.
      
      -The 'interval' member of the gen_estimator structure removed as it is
      not needed.
      Signed-off-by: NRanko Zivojnovic <ranko@spidernet.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0929c2dd
    • R
      Freezer: make kernel threads nonfreezable by default · 83144186
      Rafael J. Wysocki 提交于
      Currently, the freezer treats all tasks as freezable, except for the kernel
      threads that explicitly set the PF_NOFREEZE flag for themselves.  This
      approach is problematic, since it requires every kernel thread to either
      set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
      care for the freezing of tasks at all.
      
      It seems better to only require the kernel threads that want to or need to
      be frozen to use some freezer-related code and to remove any
      freezer-related code from the other (nonfreezable) kernel threads, which is
      done in this patch.
      
      The patch causes all kernel threads to be nonfreezable by default (ie.  to
      have PF_NOFREEZE set by default) and introduces the set_freezable()
      function that should be called by the freezable kernel threads in order to
      unset PF_NOFREEZE.  It also makes all of the currently freezable kernel
      threads call set_freezable(), so it shouldn't cause any (intentional)
      change of behaviour to appear.  Additionally, it updates documentation to
      describe the freezing of tasks more accurately.
      
      [akpm@linux-foundation.org: build fixes]
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Acked-by: NNigel Cunningham <nigel@nigel.suspend2.net>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      83144186
  18. 17 7月, 2007 2 次提交
    • L
      Revert "[NET]: Fix races in net_rx_action vs netpoll." · 2e27afb3
      Linus Torvalds 提交于
      This reverts commit 29578624.
      
      Ingo Molnar reports complete breakage with his e1000 card (no
      networking, card reports transmit timeouts), and bisected it down to
      this commit.  Let's figure out what went wrong, but not keep breaking
      machines until we do.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Olaf Kirch <olaf.kirch@oracle.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2e27afb3
    • U
      O_CLOEXEC for SCM_RIGHTS · 4a19542e
      Ulrich Drepper 提交于
      Part two in the O_CLOEXEC saga: adding support for file descriptors received
      through Unix domain sockets.
      
      The patch is once again pretty minimal, it introduces a new flag for recvmsg
      and passes it just like the existing MSG_CMSG_COMPAT flag.  I think this bit
      is not used otherwise but the networking people will know better.
      
      This new flag is not recognized by recvfrom and recv.  These functions cannot
      be used for that purpose and the asymmetry this introduces is not worse than
      the already existing MSG_CMSG_COMPAT situations.
      
      The patch must be applied on the patch which introduced O_CLOEXEC.  It has to
      remove static from the new get_unused_fd_flags function but since scm.c cannot
      live in a module the function still hasn't to be exported.
      
      Here's a test program to make sure the code works.  It's so much longer than
      the actual patch...
      
      #include <errno.h>
      #include <error.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/socket.h>
      #include <sys/un.h>
      
      #ifndef O_CLOEXEC
      # define O_CLOEXEC 02000000
      #endif
      #ifndef MSG_CMSG_CLOEXEC
      # define MSG_CMSG_CLOEXEC 0x40000000
      #endif
      
      int
      main (int argc, char *argv[])
      {
        if (argc > 1)
          {
            int fd = atol (argv[1]);
            printf ("child: fd = %d\n", fd);
            if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
              {
                puts ("file descriptor valid in child");
                return 1;
              }
            return 0;
      
          }
      
        struct sockaddr_un sun;
        strcpy (sun.sun_path, "./testsocket");
        sun.sun_family = AF_UNIX;
      
        char databuf[] = "hello";
        struct iovec iov[1];
        iov[0].iov_base = databuf;
        iov[0].iov_len = sizeof (databuf);
      
        union
        {
          struct cmsghdr hdr;
          char bytes[CMSG_SPACE (sizeof (int))];
        } buf;
        struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1,
                              .msg_control = buf.bytes,
                              .msg_controllen = sizeof (buf) };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);
      
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN (sizeof (int));
      
        msg.msg_controllen = cmsg->cmsg_len;
      
        pid_t child = fork ();
        if (child == -1)
          error (1, errno, "fork");
        if (child == 0)
          {
            int sock = socket (PF_UNIX, SOCK_STREAM, 0);
            if (sock < 0)
              error (1, errno, "socket");
      
            if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
              error (1, errno, "bind");
            if (listen (sock, SOMAXCONN) < 0)
              error (1, errno, "listen");
      
            int conn = accept (sock, NULL, NULL);
            if (conn == -1)
              error (1, errno, "accept");
      
            *(int *) CMSG_DATA (cmsg) = sock;
            if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0)
              error (1, errno, "sendmsg");
      
            return 0;
          }
      
        /* For a test suite this should be more robust like a
           barrier in shared memory.  */
        sleep (1);
      
        int sock = socket (PF_UNIX, SOCK_STREAM, 0);
        if (sock < 0)
          error (1, errno, "socket");
      
        if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
          error (1, errno, "connect");
        unlink (sun.sun_path);
      
        *(int *) CMSG_DATA (cmsg) = -1;
      
        if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0)
          error (1, errno, "recvmsg");
      
        int fd = *(int *) CMSG_DATA (cmsg);
        if (fd == -1)
          error (1, 0, "no descriptor received");
      
        char fdname[20];
        snprintf (fdname, sizeof (fdname), "%d", fd);
        execl ("/proc/self/exe", argv[0], fdname, NULL);
        puts ("execl failed");
        return 1;
      }
      
      [akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Michael Buesch <mb@bu3sch.de>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4a19542e
  19. 15 7月, 2007 3 次提交