1. 02 2月, 2015 1 次提交
  2. 30 1月, 2015 1 次提交
    • N
      rtnetlink: pass link_net to the newlink handler · 7b4ce694
      Nicolas Dichtel 提交于
      When IFLA_LINK_NETNSID is used, the netdevice should be built in this link netns
      and moved at the end to another netns (pointed by the socket netns or
      IFLA_NET_NS_[PID|FD]).
      
      Existing user of the newlink handler will use the netns argument (src_net) to
      find a link netdevice or to check some other information into the link netns.
      For example, to find a netdevice, two information are required: an ifindex
      (usually from IFLA_LINK) and a netns (this link netns).
      
      Note: when using IFLA_LINK_NETNSID and IFLA_NET_NS_[PID|FD], a user may create a
      netdevice that stands in netnsX and with its link part in netnsY, by sending a
      rtnl message from netnsZ.
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7b4ce694
  3. 24 1月, 2015 1 次提交
  4. 20 1月, 2015 2 次提交
  5. 19 1月, 2015 2 次提交
  6. 18 1月, 2015 2 次提交
    • J
      netlink: make nlmsg_end() and genlmsg_end() void · 053c095a
      Johannes Berg 提交于
      Contrary to common expectations for an "int" return, these functions
      return only a positive value -- if used correctly they cannot even
      return 0 because the message header will necessarily be in the skb.
      
      This makes the very common pattern of
      
        if (genlmsg_end(...) < 0) { ... }
      
      be a whole bunch of dead code. Many places also simply do
      
        return nlmsg_end(...);
      
      and the caller is expected to deal with it.
      
      This also commonly (at least for me) causes errors, because it is very
      common to write
      
        if (my_function(...))
          /* error condition */
      
      and if my_function() does "return nlmsg_end()" this is of course wrong.
      
      Additionally, there's not a single place in the kernel that actually
      needs the message length returned, and if anyone needs it later then
      it'll be very easy to just use skb->len there.
      
      Remove this, and make the functions void. This removes a bunch of dead
      code as described above. The patch adds lines because I did
      
      -	return nlmsg_end(...);
      +	nlmsg_end(...);
      +	return 0;
      
      I could have preserved all the function's return values by returning
      skb->len, but instead I've audited all the places calling the affected
      functions and found that none cared. A few places actually compared
      the return value with <= 0 in dump functionality, but that could just
      be changed to < 0 with no change in behaviour, so I opted for the more
      efficient version.
      
      One instance of the error I've made numerous times now is also present
      in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
      check for <0 or <=0 and thus broke out of the loop every single time.
      I've preserved this since it will (I think) have caused the messages to
      userspace to be formatted differently with just a single message for
      every SKB returned to userspace. It's possible that this isn't needed
      for the tools that actually use this, but I don't even know what they
      are so couldn't test that changing this behaviour would be acceptable.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      053c095a
    • R
      bridge: fix setlink/dellink notifications · 02dba438
      Roopa Prabhu 提交于
      problems with bridge getlink/setlink notifications today:
              - bridge setlink generates two notifications to userspace
                      - one from the bridge driver
                      - one from rtnetlink.c (rtnl_bridge_notify)
              - dellink generates one notification from rtnetlink.c. Which
      	means bridge setlink and dellink notifications are not
      	consistent
      
              - Looking at the code it appears,
      	If both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF were set,
              the size calculation in rtnl_bridge_notify can be wrong.
              Example: if you set both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF
              in a setlink request to rocker dev, rtnl_bridge_notify will
      	allocate skb for one set of bridge attributes, but,
      	both the bridge driver and rocker dev will try to add
      	attributes resulting in twice the number of attributes
      	being added to the skb.  (rocker dev calls ndo_dflt_bridge_getlink)
      
      There are multiple options:
      1) Generate one notification including all attributes from master and self:
         But, I don't think it will work, because both master and self may use
         the same attributes/policy. Cannot pack the same set of attributes in a
         single notification from both master and slave (duplicate attributes).
      
      2) Generate one notification from master and the other notification from
         self (This seems to be ideal):
           For master: the master driver will send notification (bridge in this
      	example)
           For self: the self driver will send notification (rocker in the above
      	example. It can use helpers from rtnetlink.c to do so. Like the
      	ndo_dflt_bridge_getlink api).
      
      This patch implements 2) (leaving the 'rtnl_bridge_notify' around to be used
      with 'self').
      
      v1->v2 :
      	- rtnl_bridge_notify is now called only for self,
      	so, remove 'BRIDGE_FLAGS_SELF' check and cleanup a few things
      	- rtnl_bridge_dellink used to always send a RTM_NEWLINK msg
      	earlier. So, I have changed the notification from br_dellink to
      	go as RTM_NEWLINK
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02dba438
  7. 06 1月, 2015 2 次提交
    • D
      net: tcp: add RTAX_CC_ALGO fib handling · ea697639
      Daniel Borkmann 提交于
      This patch adds the minimum necessary for the RTAX_CC_ALGO congestion
      control metric to be set up and dumped back to user space.
      
      While the internal representation of RTAX_CC_ALGO is handled as a u32
      key, we avoided to expose this implementation detail to user space, thus
      instead, we chose the netlink attribute that is being exchanged between
      user space to be the actual congestion control algorithm name, similarly
      as in the setsockopt(2) API in order to allow for maximum flexibility,
      even for 3rd party modules.
      
      It is a bit unfortunate that RTAX_QUICKACK used up a whole RTAX slot as
      it should have been stored in RTAX_FEATURES instead, we first thought
      about reusing it for the congestion control key, but it brings more
      complications and/or confusion than worth it.
      
      Joint work with Florian Westphal.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ea697639
    • H
      net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined · 6cb69742
      Hubert Sokolowski 提交于
      Add checking whether the call to ndo_dflt_fdb_dump is needed.
      It is not expected to call ndo_dflt_fdb_dump unconditionally
      by some drivers (i.e. qlcnic or macvlan) that defines
      own ndo_fdb_dump. Other drivers define own ndo_fdb_dump
      and don't want ndo_dflt_fdb_dump to be called at all.
      At the same time it is desirable to call the default dump
      function on a bridge device.
      Fix attributes that are passed to dev->netdev_ops->ndo_fdb_dump.
      Add extra checking in br_fdb_dump to avoid duplicate entries
      as now filter_dev can be NULL.
      
      Following tests for filtering have been performed before
      the change and after the patch was applied to make sure
      they are the same and it doesn't break the filtering algorithm.
      
      [root@localhost ~]# cd /root/iproute2-3.18.0/bridge
      [root@localhost bridge]# modprobe dummy
      [root@localhost bridge]# ./bridge fdb add f1:f2:f3:f4:f5:f6 dev dummy0
      [root@localhost bridge]# brctl addbr br0
      [root@localhost bridge]# brctl addif  br0 dummy0
      [root@localhost bridge]# ip link set dev br0 address 02:00:00:12:01:04
      [root@localhost bridge]# # show all
      [root@localhost bridge]# ./bridge fdb show
      33:33:00:00:00:01 dev p2p1 self permanent
      01:00:5e:00:00:01 dev p2p1 self permanent
      33:33:ff:ac:ce:32 dev p2p1 self permanent
      33:33:00:00:02:02 dev p2p1 self permanent
      01:00:5e:00:00:fb dev p2p1 self permanent
      33:33:00:00:00:01 dev p7p1 self permanent
      01:00:5e:00:00:01 dev p7p1 self permanent
      33:33:ff:79:50:53 dev p7p1 self permanent
      33:33:00:00:02:02 dev p7p1 self permanent
      01:00:5e:00:00:fb dev p7p1 self permanent
      f2:46:50:85:6d:d9 dev dummy0 master br0 permanent
      f2:46:50:85:6d:d9 dev dummy0 vlan 1 master br0 permanent
      33:33:00:00:00:01 dev dummy0 self permanent
      f1:f2:f3:f4:f5:f6 dev dummy0 self permanent
      33:33:00:00:00:01 dev br0 self permanent
      02:00:00:12:01:04 dev br0 vlan 1 master br0 permanent
      02:00:00:12:01:04 dev br0 master br0 permanent
      [root@localhost bridge]# # filter by bridge
      [root@localhost bridge]# ./bridge fdb show br br0
      f2:46:50:85:6d:d9 dev dummy0 master br0 permanent
      f2:46:50:85:6d:d9 dev dummy0 vlan 1 master br0 permanent
      33:33:00:00:00:01 dev dummy0 self permanent
      f1:f2:f3:f4:f5:f6 dev dummy0 self permanent
      33:33:00:00:00:01 dev br0 self permanent
      02:00:00:12:01:04 dev br0 vlan 1 master br0 permanent
      02:00:00:12:01:04 dev br0 master br0 permanent
      [root@localhost bridge]# # filter by port
      [root@localhost bridge]# ./bridge fdb show brport dummy0
      f2:46:50:85:6d:d9 master br0 permanent
      f2:46:50:85:6d:d9 vlan 1 master br0 permanent
      33:33:00:00:00:01 self permanent
      f1:f2:f3:f4:f5:f6 self permanent
      [root@localhost bridge]# # filter by port + bridge
      [root@localhost bridge]# ./bridge fdb show br br0 brport dummy0
      f2:46:50:85:6d:d9 master br0 permanent
      f2:46:50:85:6d:d9 vlan 1 master br0 permanent
      33:33:00:00:00:01 self permanent
      f1:f2:f3:f4:f5:f6 self permanent
      [root@localhost bridge]#
      Signed-off-by: NHubert Sokolowski <hubert.sokolowski@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6cb69742
  8. 17 12月, 2014 1 次提交
  9. 10 12月, 2014 2 次提交
    • R
      rocker: remove swdev mode · 1d460b98
      Roopa Prabhu 提交于
      Remove use of 'swdev' mode in rocker. rocker dev offloads
      can use the BRIDGE_FLAGS_SELF to indicate offload to hardware.
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NScott Feldman <sfeldma@gmail.com>
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1d460b98
    • M
      rtnetlink: delay RTM_DELLINK notification until after ndo_uninit() · 395eea6c
      Mahesh Bandewar 提交于
      The commit 56bfa7ee ("unregister_netdevice : move RTM_DELLINK to
      until after ndo_uninit") tried to do this ealier but while doing so
      it created a problem. Unfortunately the delayed rtmsg_ifinfo() also
      delayed call to fill_info(). So this translated into asking driver
      to remove private state and then query it's private state. This
      could have catastropic consequences.
      
      This change breaks the rtmsg_ifinfo() into two parts - one takes the
      precise snapshot of the device by called fill_info() before calling
      the ndo_uninit() and the second part sends the notification using
      collected snapshot.
      
      It was brought to notice when last link is deleted from an ipvlan device
      when it has free-ed the port and the subsequent .fill_info() call is
      trying to get the info from the port.
      
      kernel: [  255.139429] ------------[ cut here ]------------
      kernel: [  255.139439] WARNING: CPU: 12 PID: 11173 at net/core/rtnetlink.c:2238 rtmsg_ifinfo+0x100/0x110()
      kernel: [  255.139493] Modules linked in: ipvlan bonding w1_therm ds2482 wire cdc_acm ehci_pci ehci_hcd i2c_dev i2c_i801 i2c_core msr cpuid bnx2x ptp pps_core mdio libcrc32c
      kernel: [  255.139513] CPU: 12 PID: 11173 Comm: ip Not tainted 3.18.0-smp-DEV #167
      kernel: [  255.139514] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 1.0.10 05/15/2012
      kernel: [  255.139515]  0000000000000009 ffff880851b6b828 ffffffff815d87f4 00000000000000e0
      kernel: [  255.139516]  0000000000000000 ffff880851b6b868 ffffffff8109c29c 0000000000000000
      kernel: [  255.139518]  00000000ffffffa6 00000000000000d0 ffffffff81aaf580 0000000000000011
      kernel: [  255.139520] Call Trace:
      kernel: [  255.139527]  [<ffffffff815d87f4>] dump_stack+0x46/0x58
      kernel: [  255.139531]  [<ffffffff8109c29c>] warn_slowpath_common+0x8c/0xc0
      kernel: [  255.139540]  [<ffffffff8109c2ea>] warn_slowpath_null+0x1a/0x20
      kernel: [  255.139544]  [<ffffffff8150d570>] rtmsg_ifinfo+0x100/0x110
      kernel: [  255.139547]  [<ffffffff814f78b5>] rollback_registered_many+0x1d5/0x2d0
      kernel: [  255.139549]  [<ffffffff814f79cf>] unregister_netdevice_many+0x1f/0xb0
      kernel: [  255.139551]  [<ffffffff8150acab>] rtnl_dellink+0xbb/0x110
      kernel: [  255.139553]  [<ffffffff8150da90>] rtnetlink_rcv_msg+0xa0/0x240
      kernel: [  255.139557]  [<ffffffff81329283>] ? rhashtable_lookup_compare+0x43/0x80
      kernel: [  255.139558]  [<ffffffff8150d9f0>] ? __rtnl_unlock+0x20/0x20
      kernel: [  255.139562]  [<ffffffff8152cb11>] netlink_rcv_skb+0xb1/0xc0
      kernel: [  255.139563]  [<ffffffff8150a495>] rtnetlink_rcv+0x25/0x40
      kernel: [  255.139565]  [<ffffffff8152c398>] netlink_unicast+0x178/0x230
      kernel: [  255.139567]  [<ffffffff8152c75f>] netlink_sendmsg+0x30f/0x420
      kernel: [  255.139571]  [<ffffffff814e0b0c>] sock_sendmsg+0x9c/0xd0
      kernel: [  255.139575]  [<ffffffff811d1d7f>] ? rw_copy_check_uvector+0x6f/0x130
      kernel: [  255.139577]  [<ffffffff814e11c9>] ? copy_msghdr_from_user+0x139/0x1b0
      kernel: [  255.139578]  [<ffffffff814e1774>] ___sys_sendmsg+0x304/0x310
      kernel: [  255.139581]  [<ffffffff81198723>] ? handle_mm_fault+0xca3/0xde0
      kernel: [  255.139585]  [<ffffffff811ebc4c>] ? destroy_inode+0x3c/0x70
      kernel: [  255.139589]  [<ffffffff8108e6ec>] ? __do_page_fault+0x20c/0x500
      kernel: [  255.139597]  [<ffffffff811e8336>] ? dput+0xb6/0x190
      kernel: [  255.139606]  [<ffffffff811f05f6>] ? mntput+0x26/0x40
      kernel: [  255.139611]  [<ffffffff811d2b94>] ? __fput+0x174/0x1e0
      kernel: [  255.139613]  [<ffffffff814e2129>] __sys_sendmsg+0x49/0x90
      kernel: [  255.139615]  [<ffffffff814e2182>] SyS_sendmsg+0x12/0x20
      kernel: [  255.139617]  [<ffffffff815df092>] system_call_fastpath+0x12/0x17
      kernel: [  255.139619] ---[ end trace 5e6703e87d984f6b ]---
      Signed-off-by: NMahesh Bandewar <maheshb@google.com>
      Reported-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Cc: David S. Miller <davem@davemloft.net>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      395eea6c
  10. 03 12月, 2014 4 次提交
  11. 30 11月, 2014 1 次提交
  12. 27 11月, 2014 2 次提交
  13. 04 11月, 2014 1 次提交
  14. 03 9月, 2014 4 次提交
  15. 09 8月, 2014 1 次提交
  16. 17 7月, 2014 1 次提交
  17. 16 7月, 2014 2 次提交
  18. 11 7月, 2014 2 次提交
    • J
      bridge: netlink dump interface at par with brctl · 5e6d2435
      Jamal Hadi Salim 提交于
      Actually better than brctl showmacs because we can filter by bridge
      port in the kernel.
      The current bridge netlink interface doesnt scale when you have many
      bridges each with large fdbs or even bridges with many bridge ports
      
      And now for the science non-fiction novel you have all been
      waiting for..
      
      //lets see what bridge ports we have
      root@moja-1:/configs/may30-iprt/bridge# ./bridge link show
      8: eth1 state DOWN : <BROADCAST,MULTICAST> mtu 1500 master br0 state
      disabled priority 32 cost 19
      17: sw1-p1 state DOWN : <BROADCAST,NOARP> mtu 1500 master br0 state
      disabled priority 32 cost 100
      
      // show all..
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show
      33:33:00:00:00:01 dev bond0 self permanent
      33:33:00:00:00:01 dev dummy0 self permanent
      33:33:00:00:00:01 dev ifb0 self permanent
      33:33:00:00:00:01 dev ifb1 self permanent
      33:33:00:00:00:01 dev eth0 self permanent
      01:00:5e:00:00:01 dev eth0 self permanent
      33:33:ff:22:01:01 dev eth0 self permanent
      02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 dev eth1 self permanent
      33:33:00:00:00:01 dev eth1 self permanent
      33:33:00:00:00:01 dev gretap0 self permanent
      da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent
      33:33:00:00:00:01 dev sw1-p1 self permanent
      
      //filter by bridge
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0
      02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 dev eth1 self permanent
      33:33:00:00:00:01 dev eth1 self permanent
      da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent
      33:33:00:00:00:01 dev sw1-p1 self permanent
      
      // bridge sw1 has no ports attached..
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br sw1
      
      //filter by port
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show brport eth1
      02:00:00:12:01:02 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 self permanent
      33:33:00:00:00:01 self permanent
      
      // filter by port + bridge
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0 brport
      sw1-p1
      da:ac:46:27:d9:53 vlan 0 master br0 permanent
      33:33:00:00:00:01 self permanent
      
      // for shits and giggles (as they say in New Brunswick), lets
      // change the mac that br0 uses
      // Note: a magical fdb entry with no brport is added ...
      root@moja-1:/configs/may30-iprt/bridge# ip link set dev br0 address
      02:00:00:12:01:04
      
      // lets see if we can see the unicorn ..
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show
      33:33:00:00:00:01 dev bond0 self permanent
      33:33:00:00:00:01 dev dummy0 self permanent
      33:33:00:00:00:01 dev ifb0 self permanent
      33:33:00:00:00:01 dev ifb1 self permanent
      33:33:00:00:00:01 dev eth0 self permanent
      01:00:5e:00:00:01 dev eth0 self permanent
      33:33:ff:22:01:01 dev eth0 self permanent
      02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 dev eth1 self permanent
      33:33:00:00:00:01 dev eth1 self permanent
      33:33:00:00:00:01 dev gretap0 self permanent
      02:00:00:12:01:04 dev br0 vlan 0 master br0 permanent <=== there it is
      da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent
      33:33:00:00:00:01 dev sw1-p1 self permanent
      
      //can we see it if we filter by bridge?
      root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0
      02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
      00:17:42:8a:b4:07 dev eth1 self permanent
      33:33:00:00:00:01 dev eth1 self permanent
      02:00:00:12:01:04 dev br0 vlan 0 master br0 permanent <=== there it is
      da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent
      33:33:00:00:00:01 dev sw1-p1 self permanent
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5e6d2435
    • J
      bridge: fdb dumping takes a filter device · 5d5eacb3
      Jamal Hadi Salim 提交于
      Dumping a bridge fdb dumps every fdb entry
      held. With this change we are going to filter
      on selected bridge port.
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5d5eacb3
  19. 02 7月, 2014 1 次提交
  20. 13 6月, 2014 1 次提交
    • M
      rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 · e5eca6d4
      Michal Schmidt 提交于
      When running RHEL6 userspace on a current upstream kernel, "ip link"
      fails to show VF information.
      
      The reason is a kernel<->userspace API change introduced by commit
      88c5b5ce ("rtnetlink: Call nlmsg_parse() with correct header length"),
      after which the kernel does not see iproute2's IFLA_EXT_MASK attribute
      in the netlink request.
      
      iproute2 adjusted for the API change in its commit 63338dca4513
      ("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter").
      
      The problem has been noticed before:
      http://marc.info/?l=linux-netdev&m=136692296022182&w=2
      (Subject: Re: getting VF link info seems to be broken in 3.9-rc8)
      
      We can do better than tell those with old userspace to upgrade. We can
      recognize the old iproute2 in the kernel by checking the netlink message
      length. Even when including the IFLA_EXT_MASK attribute, its netlink
      message is shorter than struct ifinfomsg.
      
      With this patch "ip link" shows VF information in both old and new
      iproute2 versions.
      Signed-off-by: NMichal Schmidt <mschmidt@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e5eca6d4
  21. 12 6月, 2014 1 次提交
  22. 09 6月, 2014 1 次提交
  23. 04 6月, 2014 1 次提交
  24. 24 5月, 2014 1 次提交
    • S
      net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool. · ed616689
      Sucheta Chakraborty 提交于
      o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed
        to have a bandwidth of at least this value.
        max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth
        of up to this value.
      
      o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced
        which takes 4 arguments:
        netdev, VF number, min_tx_rate, max_tx_rate
      
      o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler.
      
      o Drivers that currently implement ndo_set_vf_tx_rate should now call
        ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth
        greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet
        implemented by driver.
      
      o If user enters only one of either min_tx_rate or max_tx_rate, then,
        userland should read back the other value from driver and set both
        for IFLA_VF_RATE.
        Drivers that have not yet implemented IFLA_VF_RATE should always
        return min_tx_rate as 0 when read from ip tool.
      
      o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then
        IFLA_VF_RATE should override.
      
      o Idea is to have consistent display of rate values to user.
      
      o Usage example: -
      
        ./ip link set p4p1 vf 0 rate 900
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      
        ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps,
          min_tx_rate 200Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      
        ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5, tx rate 600 (Mbps), max_tx_rate 600Mbps,
          min_tx_rate 200Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      Signed-off-by: NSucheta Chakraborty <sucheta.chakraborty@qlogic.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ed616689
  25. 16 5月, 2014 1 次提交
  26. 25 4月, 2014 1 次提交
    • D
      rtnetlink: Only supply IFLA_VF_PORTS information when RTEXT_FILTER_VF is set · c53864fd
      David Gibson 提交于
      Since 115c9b81 (rtnetlink: Fix problem with
      buffer allocation), RTM_NEWLINK messages only contain the IFLA_VFINFO_LIST
      attribute if they were solicited by a GETLINK message containing an
      IFLA_EXT_MASK attribute with the RTEXT_FILTER_VF flag.
      
      That was done because some user programs broke when they received more data
      than expected - because IFLA_VFINFO_LIST contains information for each VF
      it can become large if there are many VFs.
      
      However, the IFLA_VF_PORTS attribute, supplied for devices which implement
      ndo_get_vf_port (currently the 'enic' driver only), has the same problem.
      It supplies per-VF information and can therefore become large, but it is
      not currently conditional on the IFLA_EXT_MASK value.
      
      Worse, it interacts badly with the existing EXT_MASK handling.  When
      IFLA_EXT_MASK is not supplied, the buffer for netlink replies is fixed at
      NLMSG_GOODSIZE.  If the information for IFLA_VF_PORTS exceeds this, then
      rtnl_fill_ifinfo() returns -EMSGSIZE on the first message in a packet.
      netlink_dump() will misinterpret this as having finished the listing and
      omit data for this interface and all subsequent ones.  That can cause
      getifaddrs(3) to enter an infinite loop.
      
      This patch addresses the problem by only supplying IFLA_VF_PORTS when
      IFLA_EXT_MASK is supplied with the RTEXT_FILTER_VF flag set.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c53864fd