1. 28 11月, 2010 1 次提交
    • T
      rtnl: make link af-specific updates atomic · cf7afbfe
      Thomas Graf 提交于
      As David pointed out correctly, updates to af-specific attributes
      are currently not atomic. If multiple changes are requested and
      one of them fails, previous updates may have been applied already
      leaving the link behind in a undefined state.
      
      This patch splits the function parse_link_af() into two functions
      validate_link_af() and set_link_at(). validate_link_af() is placed
      to validate_linkmsg() check for errors as early as possible before
      any changes to the link have been made. set_link_af() is called to
      commit the changes later.
      
      This method is not fail proof, while it is currently sufficient
      to make set_link_af() inerrable and thus 100% atomic, the
      validation function method will not be able to detect all error
      scenarios in the future, there will likely always be errors
      depending on states which are f.e. not protected by rtnl_mutex
      and thus may change between validation and setting.
      
      Also, instead of silently ignoring unknown address families and
      config blocks for address families which did not register a set
      function the errors EAFNOSUPPORT respectively EOPNOSUPPORT are
      returned to avoid comitting 4 out of 5 update requests without
      notifying the user.
      Signed-off-by: NThomas Graf <tgraf@infradead.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf7afbfe
  2. 18 11月, 2010 1 次提交
    • T
      rtnetlink: Link address family API · f8ff182c
      Thomas Graf 提交于
      Each net_device contains address family specific data such as
      per device settings and statistics. We already expose this data
      via procfs/sysfs and partially netlink.
      
      The netlink method requires the requester to send one RTM_GETLINK
      request for each address family it wishes to receive data of
      and then merge this data itself.
      
      This patch implements a new API which combines all address family
      specific link data in a new netlink attribute IFLA_AF_SPEC.
      IFLA_AF_SPEC contains a sequence of nested attributes, one for each
      address family which in turn defines the structure of its own
      attribute. Example:
      
         [IFLA_AF_SPEC] = {
             [AF_INET] = {
                 [IFLA_INET_CONF] = ...,
             },
             [AF_INET6] = {
                 [IFLA_INET6_FLAGS] = ...,
                 [IFLA_INET6_CONF] = ...,
             }
         }
      
      The API also allows for address families to implement a function
      which parses the IFLA_AF_SPEC attribute sent by userspace to
      implement address family specific link options.
      Signed-off-by: NThomas Graf <tgraf@infradead.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f8ff182c
  3. 13 11月, 2010 1 次提交
    • T
      rtnetlink: Fix message size calculation for link messages · 369cf77a
      Thomas Graf 提交于
      nlmsg_total_size() calculates the length of a netlink message
      including header and alignment. nla_total_size() calculates the
      space an individual attribute consumes which was meant to be used
      in this context.
      
      Also, ensure to account for the attribute header for the
      IFLA_INFO_XSTATS attribute as implementations of get_xstats_size()
      seem to assume that we do so.
      
      The addition of two message headers minus the missing attribute
      header resulted in a calculated message size that was larger than
      required. Therefore we never risked running out of skb tailroom.
      Signed-off-by: NThomas Graf <tgraf@infradead.org>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      369cf77a
  4. 21 10月, 2010 1 次提交
  5. 24 8月, 2010 1 次提交
  6. 13 7月, 2010 1 次提交
  7. 08 7月, 2010 1 次提交
    • E
      net: fix 64 bit counters on 32 bit arches · 28172739
      Eric Dumazet 提交于
      There is a small possibility that a reader gets incorrect values on 32
      bit arches. SNMP applications could catch incorrect counters when a
      32bit high part is changed by another stats consumer/provider.
      
      One way to solve this is to add a rtnl_link_stats64 param to all
      ndo_get_stats64() methods, and also add such a parameter to
      dev_get_stats().
      
      Rule is that we are not allowed to use dev->stats64 as a temporary
      storage for 64bit stats, but a caller provided area (usually on stack)
      
      Old drivers (only providing get_stats() method) need no changes.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      28172739
  8. 13 6月, 2010 1 次提交
    • B
      net: Enable 64-bit net device statistics on 32-bit architectures · be1f3c2c
      Ben Hutchings 提交于
      Use struct rtnl_link_stats64 as the statistics structure.
      
      On 32-bit architectures, insert 32 bits of padding after/before each
      field of struct net_device_stats to make its layout compatible with
      struct rtnl_link_stats64.  Add an anonymous union in net_device; move
      stats into the union and add struct rtnl_link_stats64 stats64.
      
      Add net_device_ops::ndo_get_stats64, implementations of which will
      return a pointer to struct rtnl_link_stats64.  Drivers that implement
      this operation must not update the structure asynchronously.
      
      Change dev_get_stats() to call ndo_get_stats64 if available, and to
      return a pointer to struct rtnl_link_stats64.  Change callers of
      dev_get_stats() accordingly.
      Signed-off-by: NBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be1f3c2c
  9. 28 5月, 2010 2 次提交
  10. 24 5月, 2010 1 次提交
    • D
      rtnetlink: Fix error handling in do_setlink() · 253683bb
      David Howells 提交于
      Commit c02db8c6:
      
      	Author:  Chris Wright <chrisw@sous-sol.org>
      	Date:    Sun May 16 01:05:45 2010 -0700
      	Subject: rtnetlink: make SR-IOV VF interface symmetric
      
      adds broken error handling to do_setlink() in net/core/rtnetlink.c.  The
      problem is the following chunk of code:
      
      	if (tb[IFLA_VFINFO_LIST]) {
      		struct nlattr *attr;
      		int rem;
      		nla_for_each_nested(attr, tb[IFLA_VFINFO_LIST], rem) {
      			if (nla_type(attr) != IFLA_VF_INFO)
        ---->				goto errout;
      			err = do_setvfinfo(dev, attr);
      			if (err < 0)
      				goto errout;
      			modified = 1;
      		}
      	}
      
      which can get to errout without setting err, resulting in the following error:
      
      net/core/rtnetlink.c: In function 'do_setlink':
      net/core/rtnetlink.c:904: warning: 'err' may be used uninitialized in this function
      
      Change the code to return -EINVAL in this case.  Note that this might not be
      the appropriate error though.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cc: Chris Wright <chrisw@sous-sol.org>
      cc: David S. Miller <davem@davemloft.net>
      Acked-by: NChris Wright <chrisw@sous-sol.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      253683bb
  11. 18 5月, 2010 1 次提交
    • S
      net: Add netlink support for virtual port management (was iovnl) · 57b61080
      Scott Feldman 提交于
      Add new netdev ops ndo_{set|get}_vf_port to allow setting of
      port-profile on a netdev interface.  Extends netlink socket RTM_SETLINK/
      RTM_GETLINK with two new sub msgs called IFLA_VF_PORTS and IFLA_PORT_SELF
      (added to end of IFLA_cmd list).  These are both nested atrtibutes
      using this layout:
      
                    [IFLA_NUM_VF]
                    [IFLA_VF_PORTS]
                            [IFLA_VF_PORT]
                                    [IFLA_PORT_*], ...
                            [IFLA_VF_PORT]
                                    [IFLA_PORT_*], ...
                            ...
                    [IFLA_PORT_SELF]
                            [IFLA_PORT_*], ...
      
      These attributes are design to be set and get symmetrically.  VF_PORTS
      is a list of VF_PORTs, one for each VF, when dealing with an SR-IOV
      device.  PORT_SELF is for the PF of the SR-IOV device, in case it wants
      to also have a port-profile, or for the case where the VF==PF, like in
      enic patch 2/2 of this patch set.
      
      A port-profile is used to configure/enable the external switch virtual port
      backing the netdev interface, not to configure the host-facing side of the
      netdev.  A port-profile is an identifier known to the switch.  How port-
      profiles are installed on the switch or how available port-profiles are
      made know to the host is outside the scope of this patch.
      
      There are two types of port-profiles specs in the netlink msg.  The first spec
      is for 802.1Qbg (pre-)standard, VDP protocol.  The second spec is for devices
      that run a similar protocol as VDP but in firmware, thus hiding the protocol
      details.  In either case, the specs have much in common and makes sense to
      define the netlink msg as the union of the two specs.  For example, both specs
      have a notition of associating/deassociating a port-profile.  And both specs
      require some information from the hypervisor manager, such as client port
      instance ID.
      
      The general flow is the port-profile is applied to a host netdev interface
      using RTM_SETLINK, the receiver of the RTM_SETLINK msg communicates with the
      switch, and the switch virtual port backing the host netdev interface is
      configured/enabled based on the settings defined by the port-profile.  What
      those settings comprise, and how those settings are managed is again
      outside the scope of this patch, since this patch only deals with the
      first step in the flow.
      Signed-off-by: NScott Feldman <scofeldm@cisco.com>
      Signed-off-by: NRoopa Prabhu <roprabhu@cisco.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      57b61080
  12. 16 5月, 2010 1 次提交
  13. 26 4月, 2010 1 次提交
    • P
      net: rtnetlink: decouple rtnetlink address families from real address families · 25239cee
      Patrick McHardy 提交于
      Decouple rtnetlink address families from real address families in socket.h to
      be able to add rtnetlink interfaces to code that is not a real address family
      without increasing AF_MAX/NPROTO.
      
      This will be used to add support for multicast route dumping from all tables
      as the proc interface can't be extended to support anything but the main table
      without breaking compatibility.
      
      This partialy undoes the patch to introduce independant families for routing
      rules and converts ipmr routing rules to a new rtnetlink family. Similar to
      that patch, values up to 127 are reserved for real address families, values
      above that may be used arbitrarily.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      25239cee
  14. 23 4月, 2010 1 次提交
  15. 14 4月, 2010 1 次提交
  16. 28 3月, 2010 2 次提交
  17. 22 3月, 2010 1 次提交
  18. 17 3月, 2010 1 次提交
  19. 27 2月, 2010 4 次提交
    • P
      rtnetlink: support specifying device flags on device creation · 3729d502
      Patrick McHardy 提交于
      commit e8469ed959c373c2ff9e6f488aa5a14971aebe1f
      Author: Patrick McHardy <kaber@trash.net>
      Date:   Tue Feb 23 20:41:30 2010 +0100
      
      Support specifying the initial device flags when creating a device though
      rtnl_link. Devices allocated by rtnl_create_link() are marked as INITIALIZING
      in order to surpress netlink registration notifications. To complete setup,
      rtnl_configure_link() must be called, which performs the device flag changes
      and invokes the deferred notifiers if everything went well.
      
      Two examples:
      
      # add macvlan to eth0
      #
      $ ip link add link eth0 up allmulticast on type macvlan
      
      [LINK]11: macvlan0@eth0: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
          link/ether 26:f8:84:02:f9:2a brd ff:ff:ff:ff:ff:ff
      [ROUTE]ff00::/8 dev macvlan0  table local  metric 256  mtu 1500 advmss 1440 hoplimit 0
      [ROUTE]fe80::/64 dev macvlan0  proto kernel  metric 256  mtu 1500 advmss 1440 hoplimit 0
      [LINK]11: macvlan0@eth0: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500
          link/ether 26:f8:84:02:f9:2a
      [ADDR]11: macvlan0    inet6 fe80::24f8:84ff:fe02:f92a/64 scope link
             valid_lft forever preferred_lft forever
      [ROUTE]local fe80::24f8:84ff:fe02:f92a via :: dev lo  table local  proto none  metric 0  mtu 16436 advmss 16376 hoplimit 0
      [ROUTE]default via fe80::215:e9ff:fef0:10f8 dev macvlan0  proto kernel  metric 1024  mtu 1500 advmss 1440 hoplimit 0
      [NEIGH]fe80::215:e9ff:fef0:10f8 dev macvlan0 lladdr 00:15:e9:f0:10:f8 router STALE
      [ROUTE]2001:6f8:974::/64 dev macvlan0  proto kernel  metric 256  expires 0sec mtu 1500 advmss 1440 hoplimit 0
      [PREFIX]prefix 2001:6f8:974::/64 dev macvlan0 onlink autoconf valid 14400 preferred 131084
      [ADDR]11: macvlan0    inet6 2001:6f8:974:0:24f8:84ff:fe02:f92a/64 scope global dynamic
             valid_lft 86399sec preferred_lft 14399sec
      
      # add VLAN to eth1, eth1 is down
      #
      $ ip link add link eth1 up type vlan id 1000
      RTNETLINK answers: Network is down
      
      <no events>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3729d502
    • P
      dev: support deferring device flag change notifications · bd380811
      Patrick McHardy 提交于
      Split dev_change_flags() into two functions: __dev_change_flags() to
      perform the actual changes and __dev_notify_flags() to invoke netdevice
      notifiers. This will be used by rtnl_link to defer netlink notifications
      until the device has been fully configured.
      
      This changes ordering of some operations, in particular:
      
      - netlink notifications are sent after all changes have been performed.
        As a side effect this surpresses one unnecessary netlink message when
        the IFF_UP and other flags are changed simultaneously.
      
      - The NETDEV_UP/NETDEV_DOWN and NETDEV_CHANGE notifiers are invoked
        after all changes have been performed. Their relative is unchanged.
      
      - net_dmaengine_put() is invoked before the NETDEV_DOWN notifier instead
        of afterwards. This should not make any difference since both RX and TX
        are already shut down at this point.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bd380811
    • P
      rtnetlink: handle rtnl_link netlink notifications manually · a2835763
      Patrick McHardy 提交于
      In order to support specifying device flags during device creation,
      we must be able to roll back device registration in case setting the
      flags fails without sending any notifications related to the device
      to userspace.
      
      This patch changes rollback_registered_many() and register_netdevice()
      to manually send netlink notifications for devices not handled by
      rtnl_link and allows to defer notifications for devices handled by
      rtnl_link until setup is complete.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a2835763
    • P
      rtnetlink: ignore NETDEV_PRE_UP notifier in rtnetlink_event() · 10de05af
      Patrick McHardy 提交于
      Commit 3b8bcfd5 (net: introduce pre-up netdev notifier) added a new
      notifier which is run before a device is set UP for use by cfg80211.
      
      The patch missed to add the new notifier to the ignore list in
      rtnetlink_event(), so we currently get an unnecessary netlink
      notification before a device is set UP.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10de05af
  20. 26 2月, 2010 1 次提交
  21. 25 2月, 2010 1 次提交
    • P
      net: Add checking to rcu_dereference() primitives · a898def2
      Paul E. McKenney 提交于
      Update rcu_dereference() primitives to use new lockdep-based
      checking. The rcu_dereference() in __in6_dev_get() may be
      protected either by rcu_read_lock() or RTNL, per Eric Dumazet.
      The rcu_dereference() in __sk_free() is protected by the fact
      that it is never reached if an update could change it.  Check
      for this by using rcu_dereference_check() to verify that the
      struct sock's ->sk_wmem_alloc counter is zero.
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-5-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a898def2
  22. 13 2月, 2010 1 次提交
  23. 18 1月, 2010 1 次提交
  24. 14 12月, 2009 1 次提交
    • E
      net: Fix userspace RTM_NEWLINK notifications. · d90a909e
      Eric W. Biederman 提交于
      I received some bug reports about userspace programs having problems
      because after RTM_NEWLINK was received they could not immediate access
      files under /proc/sys/net/ because they had not been registered yet.
      
      The original problem was trivially fixed by moving the userspace
      notification from rtnetlink_event() to the end of
      register_netdevice().
      
      When testing that change I discovered I was still getting RTM_NEWLINK
      events before I could access proc and I was also getting RTM_NEWLINK
      events after I was seeing RTM_DELLINK.  Things practically guaranteed
      to confuse userspace.
      
      After a little more investigation these extra notifications proved to
      be from the new notifiers NETDEV_POST_INIT and NETDEV_UNREGISTER_BATCH
      hitting the default case in rtnetlink_event, and triggering
      unnecessary RTM_NEWLINK messages.
      
      rtnetlink_event now explicitly handles NETDEV_UNREGISTER_BATCH and
      NETDEV_POST_INIT to avoid sending the incorrect userspace
      notifications.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d90a909e
  25. 08 11月, 2009 1 次提交
    • E
      net: Support specifying the network namespace upon device creation. · 81adee47
      Eric W. Biederman 提交于
      There is no good reason to not support userspace specifying the
      network namespace during device creation, and it makes it easier
      to create a network device and pass it to a child network namespace
      with a well known name.
      
      We have to be careful to ensure that the target network namespace
      for the new device exists through the life of the call.  To keep
      that logic clear I have factored out the network namespace grabbing
      logic into rtnl_link_get_net.
      
      In addtion we need to continue to pass the source network namespace
      to the rtnl_link_ops.newlink method so that we can find the base
      device source network namespace.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      81adee47
  26. 07 11月, 2009 1 次提交
  27. 28 10月, 2009 1 次提交
  28. 24 10月, 2009 1 次提交
  29. 22 10月, 2009 1 次提交
  30. 06 9月, 2009 1 次提交
    • P
      net_sched: reintroduce dev->qdisc for use by sch_api · af356afa
      Patrick McHardy 提交于
      Currently the multiqueue integration with the qdisc API suffers from
      a few problems:
      
      - with multiple queues, all root qdiscs use the same handle. This means
        they can't be exposed to userspace in a backwards compatible fashion.
      
      - all API operations always refer to queue number 0. Newly created
        qdiscs are automatically shared between all queues, its not possible
        to address individual queues or restore multiqueue behaviour once a
        shared qdisc has been attached.
      
      - Dumps only contain the root qdisc of queue 0, in case of non-shared
        qdiscs this means the statistics are incomplete.
      
      This patch reintroduces dev->qdisc, which points to the (single) root qdisc
      from userspace's point of view. Currently it either points to the first
      (non-shared) default qdisc, or a qdisc shared between all queues. The
      following patches will introduce a classful dummy qdisc, which will be used
      as root qdisc and contain the per-queue qdiscs as children.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af356afa
  31. 03 9月, 2009 1 次提交
  32. 13 7月, 2009 1 次提交
  33. 25 2月, 2009 1 次提交
    • P
      netlink: change nlmsg_notify() return value logic · 1ce85fe4
      Pablo Neira Ayuso 提交于
      This patch changes the return value of nlmsg_notify() as follows:
      
      If NETLINK_BROADCAST_ERROR is set by any of the listeners and
      an error in the delivery happened, return the broadcast error;
      else if there are no listeners apart from the socket that
      requested a change with the echo flag, return the result of the
      unicast notification. Thus, with this patch, the unicast
      notification is handled in the same way of a broadcast listener
      that has set the NETLINK_BROADCAST_ERROR socket flag.
      
      This patch is useful in case that the caller of nlmsg_notify()
      wants to know the result of the delivery of a netlink notification
      (including the broadcast delivery) and take any action in case
      that the delivery failed. For example, ctnetlink can drop packets
      if the event delivery failed to provide reliable logging and
      state-synchronization at the cost of dropping packets.
      
      This patch also modifies the rtnetlink code to ignore the return
      value of rtnl_notify() in all callers. The function rtnl_notify()
      (before this patch) returned the error of the unicast notification
      which makes rtnl_set_sk_err() reports errors to all listeners. This
      is not of any help since the origin of the change (the socket that
      requested the echoing) notices the ENOBUFS error if the notification
      fails and should resync itself.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1ce85fe4
  34. 20 11月, 2008 2 次提交
    • S
      netdev: introduce dev_get_stats() · eeda3fd6
      Stephen Hemminger 提交于
      In order for the network device ops get_stats call to be immutable, the handling
      of the default internal network device stats block has to be changed. Add a new
      helper function which replaces the old use of internal_get_stats.
      
      Note: change return code to make it clear that the caller should not
      go changing the returned statistics.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eeda3fd6
    • S
      netdev: network device operations infrastructure · d314774c
      Stephen Hemminger 提交于
      This patch changes the network device internal API to move adminstrative
      operations out of the network device structure and into a separate structure.
      
      This patch involves some hackery to maintain compatablity between the
      new and old model, so all 300+ drivers don't have to be changed at once.
      For drivers that aren't converted yet, the netdevice_ops virt function list
      still resides in the net_device structure. For old protocols, the new
      net_device_ops are copied out to the old net_device pointers.
      
      After the transistion is completed the nag message can be changed to
      an WARN_ON, and the compatiablity code can be made configurable.
      
      Some function pointers aren't moved:
      * destructor can't be in net_device_ops because
        it may need to be referenced after the module is unloaded.
      * neighbor setup is manipulated in a couple of places that need special
        consideration
      * hard_start_xmit is in the fast path for transmit.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d314774c