1. 11 Oct, 2007 20 commits
    • E
      [NET]: Factor out __dev_alloc_name from dev_alloc_name · b267b179
      Eric W. Biederman committed
      When forcibly changing the network namespace of a device
      I need something that can generate a name for the device
      in the new namespace without overwriting the old name.
      
      __dev_alloc_name provides me that functionality.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b267b179
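A minimal userspace sketch of the idea behind this commit, not the kernel code: the hypothetical helper `alloc_name_into()` expands a `%d` pattern into a caller-supplied buffer, picking the lowest free index, and never touches the device's existing name. The helper name, its parameters, and `NAME_MAX_LEN` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 16  /* illustrative buffer size for an interface name */

/*
 * Hypothetical helper: write the first unused name matching `pattern`
 * (which must contain exactly one "%d") into `buf`. `taken` lists the
 * names already in use in the target namespace. Returns the chosen
 * index, or -1 if no index below `limit` is free.
 */
static int alloc_name_into(const char *pattern,
                           const char *const *taken, int ntaken,
                           int limit, char *buf)
{
    for (int i = 0; i < limit; i++) {
        char candidate[NAME_MAX_LEN];
        int in_use = 0;

        snprintf(candidate, sizeof(candidate), pattern, i);
        for (int j = 0; j < ntaken; j++) {
            if (strcmp(candidate, taken[j]) == 0) {
                in_use = 1;
                break;
            }
        }
        if (!in_use) {
            /* Only the caller's buffer is written; nothing else changes. */
            snprintf(buf, NAME_MAX_LEN, "%s", candidate);
            return i;
        }
    }
    return -1;
}
```

With `eth0` and `eth1` already taken, the pattern `eth%d` yields `eth2` in the buffer; the caller decides whether and when to copy that into the device.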
    • E
      [NET]: Make the device list and device lookups per namespace. · 881d966b
      Eric W. Biederman committed
      This patch makes most of the generic device layer network
      namespace safe.  This patch makes dev_base_head a
      network namespace variable, and then it picks up
      a few associated variables.  The functions:
      dev_getbyhwaddr
      dev_getfirsthwbytype
      dev_get_by_flags
      dev_get_by_name
      __dev_get_by_name
      dev_get_by_index
      __dev_get_by_index
      dev_ioctl
      dev_ethtool
      dev_load
      wireless_process_ioctl
      
      were modified to take a network namespace argument, and
      deal with it.
      
      vlan_ioctl_set and brioctl_set were modified so their
      hooks will receive a network namespace argument.
      
      So basically anything in the core of the network stack that was
      affected by the change of dev_base was modified to handle
      multiple network namespaces.  The rest of the network stack was
      simply modified to explicitly use &init_net, the initial network
      namespace.  This can be fixed when those components of the network
      stack are modified to handle multiple network namespaces.
      
      For now the ifindex generator is left global.
      
      Fundamentally ifindex numbers are per namespace, or else
      we will have corner case problems with migration when
      we get that far.
      
      At the same time there are assumptions in the network stack
      that the ifindex of a network device won't change.  Making
      the ifindex number global seems a good compromise until
      the network stack can cope with ifindex changes when
      you change namespaces, and the like.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      881d966b
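The change in lookup signatures can be illustrated with a small userspace model. The structures and the function `model_dev_get_by_name()` below are simplified stand-ins invented for this sketch; they only show the device list moving from a global into a per-namespace structure that every lookup receives as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures; not the real layout. */
struct ndev_model {
    const char *name;
    struct ndev_model *next;
};

struct ns_model {
    struct ndev_model *dev_base_head;  /* per-namespace device list */
};

/* The lookup takes the namespace to search instead of using a global list. */
static struct ndev_model *
model_dev_get_by_name(struct ns_model *net, const char *name)
{
    for (struct ndev_model *d = net->dev_base_head; d; d = d->next)
        if (strcmp(d->name, name) == 0)
            return d;
    return NULL;
}
```

Code that is not yet namespace aware would pass the initial namespace for `net`, which matches the "explicitly use &init_net" approach the message describes.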
    • E
      [NET]: Support multiple network namespaces with netlink · b4b51029
      Eric W. Biederman committed
      Each netlink socket will live in exactly one network namespace,
      this includes the controlling kernel sockets.
      
      This patch updates all of the existing netlink protocols
      to only support the initial network namespace.  Requests
      by clients in other namespaces will get -ECONNREFUSED,
      as they would if the kernel did not have support for
      that netlink protocol compiled in.
      
      As each netlink protocol is updated to be multiple network
      namespace safe it can register multiple kernel sockets
      to acquire a presence in the rest of the network namespaces.
      
      The implementation in af_netlink is a simple filter implementation
      at hash table insertion and hash table look up time.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b4b51029
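The "simple filter" at lookup time can be sketched as follows. This is a userspace model with a linear scan standing in for the hash table; the type and function names are hypothetical. The point it shows is that a socket matches only when both the port id and the namespace agree.

```c
#include <assert.h>
#include <stddef.h>

struct nl_ns_model { int id; };

struct nl_sock_model {
    struct nl_ns_model *net;   /* the one namespace this socket lives in */
    unsigned int pid;          /* netlink port id */
};

/*
 * Linear stand-in for the hash table lookup: an entry is returned only
 * when it has the requested port id and lives in the requested namespace.
 */
static struct nl_sock_model *
nl_lookup_model(struct nl_sock_model *table, size_t n,
                struct nl_ns_model *net, unsigned int pid)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].pid == pid && table[i].net == net)
            return &table[i];
    return NULL;  /* the caller would treat this as "connection refused" */
}
```

Because kernel sockets are registered only in the initial namespace at this stage, a lookup from any other namespace finds nothing, which is how the refusal described above falls out of the filter.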
    • E
      [NET]: Make device event notification network namespace safe · e9dc8653
      Eric W. Biederman committed
      Every user of the network device notifiers is either a protocol
      stack or a pseudo device.  If a protocol stack that does not have
      support for multiple network namespaces receives an event for a
      device that is not in the initial network namespace it quite possibly
      can get confused and do the wrong thing.
      
      To avoid problems until all of the protocol stacks are converted
      this patch modifies all netdev event handlers to ignore events on
      devices that are not in the initial network namespace.
      
      As the rest of the code is made network namespace aware these
      checks can be removed.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9dc8653
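A sketch of the guard this commit describes, written as a userspace model rather than real notifier code. The `evt_*` types, the `EVT_NOTIFY_*` constants, and the handler name are assumptions for illustration; only the shape of the early return is the point.

```c
#include <assert.h>

struct evt_ns_model { int id; };
struct evt_dev_model { struct evt_ns_model *nd_net; };

/* Stand-in for the initial network namespace. */
static struct evt_ns_model evt_init_net = { 0 };

enum { EVT_NOTIFY_DONE = 0, EVT_NOTIFY_OK = 1 };

static int evt_handled;  /* counts events the handler actually acted on */

static int model_netdev_event(struct evt_dev_model *dev, unsigned long event)
{
    (void)event;

    /* Not yet namespace aware: ignore devices outside the initial namespace. */
    if (dev->nd_net != &evt_init_net)
        return EVT_NOTIFY_DONE;

    evt_handled++;
    return EVT_NOTIFY_OK;
}
```

Once a protocol stack is converted, the early return is the only part that needs to be removed.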
    • E
      [NET]: Initialize the network namespace of network devices. · 6d34b1c2
      Eric W. Biederman committed
      Except for carefully selected pseudo devices all network
      interfaces should start out in the initial network namespace.
      Ultimately it will be register_netdev that examines what
      dev->nd_net is set to and places a device in a network namespace.
      
      This patch modifies alloc_netdev to initialize the network
      namespace a device is in with the initial network namespace.
      This gets it right for the vast majority of devices so their
      drivers need not be modified and for those few pseudo devices
      that need something different they can change this parameter
      before calling register_netdevice.
      
      The network namespace parameter on a network device is not
      reference counted as the devices are inside of a network namespace
      and cannot remain in that namespace past the lifetime of the
      network namespace.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6d34b1c2
    • E
      [NET]: Make socket creation namespace safe. · 1b8d7ae4
      Eric W. Biederman committed
      This patch passes in the namespace a new socket should be created in
      and has the socket code do the appropriate reference counting.  By
      virtue of this all socket create methods are touched.  In addition
      the socket create methods are modified so that they will fail if
      you attempt to create a socket in a non-default network namespace.
      
      Failing if we attempt to create a socket outside of the default
      network namespace ensures that as we incrementally make the network stack
      network namespace aware we will not export functionality that someone
      has not audited and made certain is network namespace safe.
      This allows us to partially enable network namespaces before all of the
      exotic protocols are supported.
      
      Any protocol layers I have missed will fail to compile because I now
      pass an extra parameter into the socket creation code.
      
      [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1b8d7ae4
    • E
      [NET]: Make /proc/net per network namespace · 457c4cbc
      Eric W. Biederman committed
      This patch makes /proc/net per network namespace.  It modifies the global
      variables proc_net and proc_net_stat to be per network namespace.
      The proc_net file helpers are modified to take a network namespace argument,
      and all of their callers are fixed to pass &init_net for that argument.
      This ensures that all of the /proc/net files are only visible and
      usable in the initial network namespace until the code behind them
      has been updated to handle multiple network namespaces.
      
      Making /proc/net per namespace is necessary as at least some files
      in /proc/net depend upon the set of network devices which is per
      network namespace, and even more files in /proc/net have contents
      that are relevant to a single network namespace.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      457c4cbc
    • E
      [NET]: Basic network namespace infrastructure. · 5f256bec
      Eric W. Biederman committed
      This is the basic infrastructure needed to support network
      namespaces.  This infrastructure is:
      - Registration functions to support initializing per network
        namespace data when a network namespace is created or destroyed.
      
      - struct net.  The network namespace data structure.
        This structure will grow as variables are made per network
        namespace but this is the minimal starting point.
      
      - Functions to grab a reference to the network namespace.
        I provide both get/put functions that keep a network namespace
        from being freed.  And hold/release functions serve as weak references
        and will warn if their count is not zero when the data structure
        is freed.  Useful for dealing with more complicated data structures
        like the ipv4 route cache.
      
      - A list of all of the network namespaces so we can iterate over them.
      
      - A slab for the network namespace data structure allowing leaks
        to be spotted.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5f256bec
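The two kinds of reference described in the list above can be modelled in a few lines. This is a non-atomic userspace sketch with hypothetical names (`netns_ref_model`, `ref_get()`, `ref_put()`, `ref_hold()`, `ref_release()`); it assumes a strong count that controls lifetime and a separate weak count that is only checked, with a warning, when the structure is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct netns_ref_model {
    int count;      /* strong references: keep the namespace alive */
    int use_count;  /* weak references: only checked at free time */
    bool freed;
};

static void ref_get(struct netns_ref_model *n)     { n->count++; }
static void ref_hold(struct netns_ref_model *n)    { n->use_count++; }
static void ref_release(struct netns_ref_model *n) { n->use_count--; }

/* Dropping the last strong reference frees the namespace. */
static bool ref_put(struct netns_ref_model *n)
{
    if (--n->count > 0)
        return false;
    if (n->use_count != 0)
        fprintf(stderr, "warning: %d weak reference(s) still held\n",
                n->use_count);
    n->freed = true;
    return true;
}
```

A weak reference does not delay the free; it only makes a forgotten release visible, which is the debugging aid the message refers to for structures such as the route cache.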
    • J
      [NET]: Change type of owner in sock_lock_t to int, rename · d2e9117c
      John Heffner committed
      The type of owner in sock_lock_t is currently (struct sock_iocb *),
      presumably for historical reasons.  It is never used as this type, only
      tested as NULL or set to (void *)1.  For clarity, this changes it to type
      int, and renames to owned, to avoid any possible type casting errors.
      Signed-off-by: John Heffner <jheffner@psc.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d2e9117c
    • R
      [PKTGEN]: Remove softirq scheduling. · b163911f
      Robert Olsson committed
      It's not a job for pktgen.
      Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b163911f
    • R
      [PKTGEN]: Multiqueue support. · 45b270f8
      Robert Olsson committed
      Below is some pktgen support for sending into different TX queues.
      This can of course be fed into input queues on other machines.
      Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      45b270f8
    • J
      [ETHTOOL]: Internal cleanup of ethtool_value-related handlers · 13c99b24
      Jeff Garzik committed
      Several get/set functions can be handled by passing the ethtool_op
      function pointer directly to a generic function.  This permits deletion
      of a fair bit of redundant code.
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      13c99b24
    • J
    • J
    • J
      [ETHTOOL]: Add ETHTOOL_[GS]FLAGS sub-ioctls · 3ae7c0b2
      Jeff Garzik committed
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3ae7c0b2
    • S
      [NET] netconsole: Support dynamic reconfiguration using configfs · 0bcc1816
      Satyam Sharma committed
      Based upon initial work by Keiichi Kii <k-keiichi@bx.jp.nec.com>.
      
      This patch introduces support for dynamic reconfiguration (adding, removing
      and/or modifying parameters of netconsole targets at runtime) using a
      userspace interface exported via configfs.  Documentation is also updated
      accordingly.
      
      Issues and brief design overview:
      
      (1) Kernel-initiated creation / destruction of kernel objects is not
          possible with configfs -- the lifetimes of the "config items" are managed
          exclusively from userspace.  But netconsole must support boot/module
          params too, and these are parsed in kernel and hence netpolls must be
          setup from the kernel.  Joel Becker suggested to separately manage the
          lifetimes of the two kinds of netconsole_target objects -- those created
          via configfs mkdir(2) from userspace and those specified from the
          boot/module option string.  This adds complexity and some redundancy here
          and also means that boot/module param-created targets are not exposed
          through the configfs namespace (and hence cannot be updated / destroyed
          dynamically).  However, this saves us from locking / refcounting
          complexities that would need to be introduced in configfs to support
          kernel-initiated item creation / destroy there.
      
      (2) In configfs, item creation takes place in the call chain of the
          mkdir(2) syscall in the driver subsystem.  If we used an ioctl(2) to
          create / destroy objects from userspace, the special userspace program is
          able to fill out the structure to be passed into the ioctl and hence
          specify attributes such as local interface that are required at the time
          we set up the netpoll.  For configfs, this information is not available at
          the time of mkdir(2).  So, we keep all newly-created targets (via
          configfs) disabled by default.  The user is expected to set various
          attributes appropriately (including the local network interface if
          required) and then write(2) "1" to the "enabled" attribute.  Thus,
          netpoll_setup() is then called on the set parameters in the context of
          _this_ write(2) on the "enabled" attribute itself.  This design enables
          the user to reconfigure existing netconsole targets at runtime to be
          attached to newly-come-up interfaces that may not have existed when
          netconsole was loaded or when the targets were actually created.  All this
          effectively enables us to get rid of custom ioctls.
      
      (3) Ultra-paranoid configfs attribute show() and store() operations, with
          sanity and input range checking, using only safe string primitives, and
          compliant with the recommendations in Documentation/filesystems/sysfs.txt.
      
      (4) A new function netpoll_print_options() is created in the netpoll API,
          that just prints out the configured parameters for a netpoll structure.
          netpoll_parse_options() is modified to use that and it is also exported to
          be used from netconsole.
      Signed-off-by: Satyam Sharma <satyam@infradead.org>
      Acked-by: Keiichi Kii <k-keiichi@bx.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0bcc1816
    • T
      [NEIGH]: Netlink notifications · d961db35
      Thomas Graf committed
      Currently neighbour event notifications are limited to update
      notifications and only sent if the ARP daemon is enabled. This
      patch extends the existing notification code by also reporting
      neighbours being removed due to gc or administratively and
      removes the dependency on the ARP daemon. This makes it possible to
      keep track of neighbour states without periodically fetching the
      complete neighbour table.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d961db35
    • T
      [NEIGH]: Combine neighbour cleanup and release · 4f494554
      Thomas Graf committed
      Introduces neigh_cleanup_and_release() to be used after a
      neighbour has been removed from its neighbour table. Serves
      as preparation to add event notifications.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f494554
    • P
      [RTNETLINK]: Introduce generic rtnl_create_link(). · e7199288
      Pavel Emelianov committed
      This routine gets the parsed rtnl attributes and creates a new
      link with generic info (IFLA_LINKINFO policy). It is intended
      to help drivers that need to create several links at
      once (like VETH).
      
      This is nothing but a copy-pasted part of the rtnl_newlink() function
      that is responsible for creating a new device.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Acked-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e7199288
    • S
      [NET]: Make NAPI polling independent of struct net_device objects. · bea3348e
      Stephen Hemminger committed
      Several devices have multiple independent RX queues per net
      device, and some have a single interrupt doorbell for several
      queues.
      
      In either case, it's easier to support layouts like that if the
      structure representing the poll is independent from the net
      device itself.
      
      The signature of the ->poll() call back goes from:
      
      	int foo_poll(struct net_device *dev, int *budget)
      
      to
      
      	int foo_poll(struct napi_struct *napi, int budget)
      
      The caller is returned the number of RX packets processed (or
      the number of "NAPI credits" consumed if you want to get
      abstract).  The callee no longer messes around bumping
      dev->quota, *budget, etc. because that is all handled in the
      caller upon return.
      
      The napi_struct is to be embedded in the device driver private data
      structures.
      
      Furthermore, it is the driver's responsibility to disable all NAPI
      instances in its ->stop() device close handler.  Since the
      napi_struct is privatized into the driver's private data structures,
      only the driver knows how to get at all of the napi_struct instances
      it may have per-device.
      
      With lots of help and suggestions from Rusty Russell, Roland Dreier,
      Michael Chan, Jeff Garzik, and Jamal Hadi Salim.
      
      Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
      Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.
      
      [ Ported to current tree and all drivers converted.  Integrated
        Stephen's follow-on kerneldoc additions, and restored poll_list
        handling to the old style to fix mutual exclusion issues.  -DaveM ]
      Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bea3348e
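A userspace sketch of a poll routine in the new style, under stated assumptions: `napi_model` stands in for `struct napi_struct`, `foo_priv` is a hypothetical driver private structure, and the `scheduled` flag is an invented simplification of leaving polling mode. It shows the two points from the message: the poll context is embedded in the driver's private data, and the callee just returns how many packets it processed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct napi_struct; the real one has more fields. */
struct napi_model {
    int scheduled;
};

/* Hypothetical driver private data with the poll context embedded in it. */
struct foo_priv {
    int rx_pending;            /* packets waiting in the RX ring */
    struct napi_model napi;
};

/* Recover the enclosing private structure from the embedded member. */
#define foo_priv_of(ptr) \
    ((struct foo_priv *)((char *)(ptr) - offsetof(struct foo_priv, napi)))

/* New-style poll: process at most `budget` packets and return the count. */
static int foo_poll(struct napi_model *napi, int budget)
{
    struct foo_priv *priv = foo_priv_of(napi);
    int done = 0;

    while (done < budget && priv->rx_pending > 0) {
        priv->rx_pending--;    /* "receive" one packet */
        done++;
    }
    if (done < budget)
        napi->scheduled = 0;   /* ring drained: leave polling mode */
    return done;
}
```

Nothing in `foo_poll()` adjusts a quota or writes back through a budget pointer; the accounting is left to whoever called it, as the message describes.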
  2. 17 Sep, 2007 1 commit
  3. 15 Sep, 2007 1 commit
    • D
      [NET]: Fix two issues wrt. SO_BINDTODEVICE. · 4878809f
      David S. Miller committed
      1) Comments suggest that setting optlen to zero will unbind
         the socket from whatever device it might be attached to.  This
         hasn't been the case since at least 2.2.x because the first thing
         this function does is return -EINVAL if 'optlen' is less than
         sizeof(int).
      
         This check also means that passing in a two byte string doesn't
         work so well.  It's almost as if this code was tested with "eth?"
         patterned strings and nothing else :-)
      
         Fix this by breaking the logic of this facility out into a
         separate function which validates optlen more appropriately.
      
         The optlen==0 and small string cases now work properly.
      
      2) We should reset the cached route of the socket after we have made
         the device binding changes, not before.
      
      Reported by Ben Greear.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4878809f
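One way to validate `optlen` so that both the zero-length and short-name cases work is sketched below. This is a userspace model, not the kernel function: `parse_bindtodevice()` and `MODEL_IFNAMSIZ` are hypothetical, and the clamp-then-copy approach is an assumption about what "more appropriately" could mean here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MODEL_IFNAMSIZ 16  /* illustrative size of a device name buffer */

/*
 * Hypothetical model of the option parsing. On success it stores the
 * device name in `devname` (an empty string meaning "unbind") and
 * returns 0. `devname` must point to MODEL_IFNAMSIZ bytes.
 */
static int parse_bindtodevice(const char *optval, int optlen, char *devname)
{
    if (optlen < 0)
        return -EINVAL;

    /* Clamp rather than reject, so the result is always NUL-terminated. */
    if (optlen > MODEL_IFNAMSIZ - 1)
        optlen = MODEL_IFNAMSIZ - 1;

    memset(devname, 0, MODEL_IFNAMSIZ);
    if (optlen > 0)
        memcpy(devname, optval, (size_t)optlen);
    return 0;
}
```

Unlike a check against `sizeof(int)`, this accepts a two-byte name such as `lo` and treats a zero-length option as a request to unbind.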
  4. 11 Sep, 2007 1 commit
  5. 31 Aug, 2007 1 commit
  6. 29 Aug, 2007 1 commit
  7. 27 Aug, 2007 2 commits
  8. 15 Aug, 2007 1 commit
  9. 14 Aug, 2007 1 commit
  10. 08 Aug, 2007 1 commit
  11. 01 Aug, 2007 3 commits
  12. 31 Jul, 2007 6 commits
  13. 22 Jul, 2007 1 commit