1. 14 12月, 2009 1 次提交
    • E
      net: Fix userspace RTM_NEWLINK notifications. · d90a909e
      Eric W. Biederman 提交于
      I received some bug reports about userspace programs having problems
      because after RTM_NEWLINK was received they could not immediate access
      files under /proc/sys/net/ because they had not been registered yet.
      
      The original problem was trivially fixed by moving the userspace
      notification from rtnetlink_event() to the end of
      register_netdevice().
      
      When testing that change I discovered I was still getting RTM_NEWLINK
      events before I could access proc and I was also getting RTM_NEWLINK
      events after I was seeing RTM_DELLINK.  Things practically guaranteed
      to confuse userspace.
      
      After a little more investigation these extra notifications proved to
      be from the new notifiers NETDEV_POST_INIT and NETDEV_UNREGISTER_BATCH
      hitting the default case in rtnetlink_event, and triggering
      unnecessary RTM_NEWLINK messages.
      
      rtnetlink_event now explicitly handles NETDEV_UNREGISTER_BATCH and
      NETDEV_POST_INIT to avoid sending the incorrect userspace
      notifications.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d90a909e
  2. 08 11月, 2009 1 次提交
    • E
      net: Support specifying the network namespace upon device creation. · 81adee47
      Eric W. Biederman 提交于
      There is no good reason to not support userspace specifying the
      network namespace during device creation, and it makes it easier
      to create a network device and pass it to a child network namespace
      with a well known name.
      
      We have to be careful to ensure that the target network namespace
      for the new device exists through the life of the call.  To keep
      that logic clear I have factored out the network namespace grabbing
      logic into rtnl_link_get_net.
      
      In addtion we need to continue to pass the source network namespace
      to the rtnl_link_ops.newlink method so that we can find the base
      device source network namespace.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      81adee47
  3. 07 11月, 2009 1 次提交
  4. 28 10月, 2009 1 次提交
  5. 24 10月, 2009 1 次提交
  6. 22 10月, 2009 1 次提交
  7. 06 9月, 2009 1 次提交
    • P
      net_sched: reintroduce dev->qdisc for use by sch_api · af356afa
      Patrick McHardy 提交于
      Currently the multiqueue integration with the qdisc API suffers from
      a few problems:
      
      - with multiple queues, all root qdiscs use the same handle. This means
        they can't be exposed to userspace in a backwards compatible fashion.
      
      - all API operations always refer to queue number 0. Newly created
        qdiscs are automatically shared between all queues, its not possible
        to address individual queues or restore multiqueue behaviour once a
        shared qdisc has been attached.
      
      - Dumps only contain the root qdisc of queue 0, in case of non-shared
        qdiscs this means the statistics are incomplete.
      
      This patch reintroduces dev->qdisc, which points to the (single) root qdisc
      from userspace's point of view. Currently it either points to the first
      (non-shared) default qdisc, or a qdisc shared between all queues. The
      following patches will introduce a classful dummy qdisc, which will be used
      as root qdisc and contain the per-queue qdiscs as children.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af356afa
  8. 03 9月, 2009 1 次提交
  9. 13 7月, 2009 1 次提交
  10. 25 2月, 2009 1 次提交
    • P
      netlink: change nlmsg_notify() return value logic · 1ce85fe4
      Pablo Neira Ayuso 提交于
      This patch changes the return value of nlmsg_notify() as follows:
      
      If NETLINK_BROADCAST_ERROR is set by any of the listeners and
      an error in the delivery happened, return the broadcast error;
      else if there are no listeners apart from the socket that
      requested a change with the echo flag, return the result of the
      unicast notification. Thus, with this patch, the unicast
      notification is handled in the same way of a broadcast listener
      that has set the NETLINK_BROADCAST_ERROR socket flag.
      
      This patch is useful in case that the caller of nlmsg_notify()
      wants to know the result of the delivery of a netlink notification
      (including the broadcast delivery) and take any action in case
      that the delivery failed. For example, ctnetlink can drop packets
      if the event delivery failed to provide reliable logging and
      state-synchronization at the cost of dropping packets.
      
      This patch also modifies the rtnetlink code to ignore the return
      value of rtnl_notify() in all callers. The function rtnl_notify()
      (before this patch) returned the error of the unicast notification
      which makes rtnl_set_sk_err() reports errors to all listeners. This
      is not of any help since the origin of the change (the socket that
      requested the echoing) notices the ENOBUFS error if the notification
      fails and should resync itself.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1ce85fe4
  11. 20 11月, 2008 2 次提交
    • S
      netdev: introduce dev_get_stats() · eeda3fd6
      Stephen Hemminger 提交于
      In order for the network device ops get_stats call to be immutable, the handling
      of the default internal network device stats block has to be changed. Add a new
      helper function which replaces the old use of internal_get_stats.
      
      Note: change return code to make it clear that the caller should not
      go changing the returned statistics.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eeda3fd6
    • S
      netdev: network device operations infrastructure · d314774c
      Stephen Hemminger 提交于
      This patch changes the network device internal API to move adminstrative
      operations out of the network device structure and into a separate structure.
      
      This patch involves some hackery to maintain compatablity between the
      new and old model, so all 300+ drivers don't have to be changed at once.
      For drivers that aren't converted yet, the netdevice_ops virt function list
      still resides in the net_device structure. For old protocols, the new
      net_device_ops are copied out to the old net_device pointers.
      
      After the transistion is completed the nag message can be changed to
      an WARN_ON, and the compatiablity code can be made configurable.
      
      Some function pointers aren't moved:
      * destructor can't be in net_device_ops because
        it may need to be referenced after the module is unloaded.
      * neighbor setup is manipulated in a couple of places that need special
        consideration
      * hard_start_xmit is in the fast path for transmit.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d314774c
  12. 17 11月, 2008 1 次提交
  13. 17 10月, 2008 1 次提交
  14. 08 10月, 2008 1 次提交
    • H
      net: Fix netdev_run_todo dead-lock · 58ec3b4d
      Herbert Xu 提交于
      Benjamin Thery tracked down a bug that explains many instances
      of the error
      
      unregister_netdevice: waiting for %s to become free. Usage count = %d
      
      It turns out that netdev_run_todo can dead-lock with itself if
      a second instance of it is run in a thread that will then free
      a reference to the device waited on by the first instance.
      
      The problem is really quite silly.  We were trying to create
      parallelism where none was required.  As netdev_run_todo always
      follows a RTNL section, and that todo tasks can only be added
      with the RTNL held, by definition you should only need to wait
      for the very ones that you've added and be done with it.
      
      There is no need for a second mutex or spinlock.
      
      This is exactly what the following patch does.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      58ec3b4d
  15. 23 9月, 2008 1 次提交
    • S
      net: network device name ifalias support · 0b815a1a
      Stephen Hemminger 提交于
      This patch add support for keeping an additional character alias
      associated with an network interface. This is useful for maintaining
      the SNMP ifAlias value which is a user defined value. Routers use this
      to hold information like which circuit or line it is connected to. It
      is just an arbitrary text label on the network device.
      
      There are two exposed interfaces with this patch, the value can be
      read/written either via netlink or sysfs.
      
      This could be maintained just by the snmp daemon, but it is more
      generally useful for other management tools, and the kernel is good
      place to act as an agreed upon interface to store it.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b815a1a
  16. 18 7月, 2008 1 次提交
    • D
      netdev: Allocate multiple queues for TX. · e8a0464c
      David S. Miller 提交于
      alloc_netdev_mq() now allocates an array of netdev_queue
      structures for TX, based upon the queue_count argument.
      
      Furthermore, all accesses to the TX queues are now vectored
      through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
      interfaces.  This makes it easy to grep the tree for all
      things that want to get to a TX queue of a net device.
      
      Problem spots which are not really multiqueue aware yet, and
      only work with one queue, can easily be spotted by grepping
      for all netdev_get_tx_queue() calls that pass in a zero index.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e8a0464c
  17. 09 7月, 2008 1 次提交
  18. 04 6月, 2008 1 次提交
  19. 22 5月, 2008 1 次提交
  20. 24 4月, 2008 1 次提交
    • P
      [RTNETLINK]: Fix bogus ASSERT_RTNL warning · c9c1014b
      Patrick McHardy 提交于
      ASSERT_RTNL uses mutex_trylock to test whether the rtnl_mutex is
      held. This bogus warnings when running in atomic context, which
      f.e. happens when adding secondary unicast addresses through
      macvlan or vlan or when synchronizing multicast addresses from
      wireless devices.
      
      Mid-term we might want to consider moving all address updates
      to process context since the locking seems overly complicated,
      for now just fix the bogus warning by changing ASSERT_RTNL to
      use mutex_is_locked().
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c9c1014b
  21. 16 4月, 2008 2 次提交
  22. 26 3月, 2008 2 次提交
  23. 24 2月, 2008 1 次提交
    • T
      [RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK · 1840bb13
      Thomas Graf 提交于
      RTM_NEWLINK allows for already existing links to be modified. For this
      purpose do_setlink() is called which expects address attributes with a
      payload length of at least dev->addr_len. This patch adds the necessary
      validation for the RTM_NEWLINK case.
      
      The address length for links to be created is not checked for now as the
      actual attribute length is used when copying the address to the netdevice
      structure. It might make sense to report an error if less than addr_len
      bytes are provided but enforcing this might break drivers trying to be
      smart with not transmitting all zero addresses.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1840bb13
  24. 20 2月, 2008 1 次提交
  25. 18 2月, 2008 1 次提交
  26. 13 2月, 2008 1 次提交
    • L
      [RTNETLINK]: Send a single notification on device state changes. · 45b50354
      Laszlo Attila Toth 提交于
      In do_setlink() a single notification is sent at the end of the
      function if any modification occured. If the address has been changed,
      another notification is sent.
      
      Both of them is required because originally only the NETDEV_CHANGEADDR
      notification was sent and although device state change implies address
      change, some programs may expect the original notification. It remains
      for compatibity.
      
      If set_operstate() is called from do_setlink(), it doesn't send a
      notification, only if it is called from rtnl_create_link() as earlier.
      Signed-off-by: NLaszlo Attila Toth <panther@balabit.hu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      45b50354
  27. 05 2月, 2008 1 次提交
  28. 29 1月, 2008 6 次提交
  29. 21 1月, 2008 1 次提交
    • P
      [NET]: rtnl_link: fix use-after-free · 68365458
      Patrick McHardy 提交于
      When unregistering the rtnl_link_ops, all existing devices using
      the ops are destroyed. With nested devices this may lead to a
      use-after-free despite the use of for_each_netdev_safe() in case
      the upper device is next in the device list and is destroyed
      by the NETDEV_UNREGISTER notifier.
      
      The easy fix is to restart scanning the device list after removing
      a device. Alternatively we could add new devices to the front of
      the list to avoid having dependant devices follow the device they
      depend on. A third option would be to only restart scanning if
      dev->iflink of the next device matches dev->ifindex of the current
      one. For now this seems like the safest solution.
      
      With this patch, the veth rtnl_link_ops unregistration can use
      rtnl_link_unregister() directly since it now also handles destruction
      of multiple devices at once.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68365458
  30. 27 10月, 2007 1 次提交
  31. 20 10月, 2007 1 次提交
    • P
      Make access to task's nsproxy lighter · cf7b708c
      Pavel Emelyanov 提交于
      When someone wants to deal with some other taks's namespaces it has to lock
      the task and then to get the desired namespace if the one exists.  This is
      slow on read-only paths and may be impossible in some cases.
      
      E.g.  Oleg recently noticed a race between unshare() and the (sent for
      review in cgroups) pid namespaces - when the task notifies the parent it
      has to know the parent's namespace, but taking the task_lock() is
      impossible there - the code is under write locked tasklist lock.
      
      On the other hand switching the namespace on task (daemonize) and releasing
      the namespace (after the last task exit) is rather rare operation and we
      can sacrifice its speed to solve the issues above.
      
      The access to other task namespaces is proposed to be performed
      like this:
      
           rcu_read_lock();
           nsproxy = task_nsproxy(tsk);
           if (nsproxy != NULL) {
                   / *
                     * work with the namespaces here
                     * e.g. get the reference on one of them
                     * /
           } / *
               * NULL task_nsproxy() means that this task is
               * almost dead (zombie)
               * /
           rcu_read_unlock();
      
      This patch has passed the review by Eric and Oleg :) and,
      of course, tested.
      
      [clg@fr.ibm.com: fix unshare()]
      [ebiederm@xmission.com: Update get_net_ns_by_pid]
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: NCedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cf7b708c
  32. 11 10月, 2007 1 次提交
    • D
      [NET]: make netlink user -> kernel interface synchronious · cd40b7d3
      Denis V. Lunev 提交于
      This patch make processing netlink user -> kernel messages synchronious.
      This change was inspired by the talk with Alexey Kuznetsov about current
      netlink messages processing. He says that he was badly wrong when introduced 
      asynchronious user -> kernel communication.
      
      The call netlink_unicast is the only path to send message to the kernel
      netlink socket. But, unfortunately, it is also used to send data to the
      user.
      
      Before this change the user message has been attached to the socket queue
      and sk->sk_data_ready was called. The process has been blocked until all
      pending messages were processed. The bad thing is that this processing
      may occur in the arbitrary process context.
      
      This patch changes nlk->data_ready callback to get 1 skb and force packet
      processing right in the netlink_unicast.
      
      Kernel -> user path in netlink_unicast remains untouched.
      
      EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock
      drop, but the process remains in the cycle until the message will be fully
      processed. So, there is no need to use this kludges now.
      Signed-off-by: NDenis V. Lunev <den@openvz.org>
      Acked-by: NAlexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cd40b7d3