1. 11 7月, 2014 1 次提交
  2. 02 7月, 2014 1 次提交
  3. 13 6月, 2014 1 次提交
    • M
      rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 · e5eca6d4
      Michal Schmidt 提交于
      When running RHEL6 userspace on a current upstream kernel, "ip link"
      fails to show VF information.
      
      The reason is a kernel<->userspace API change introduced by commit
      88c5b5ce ("rtnetlink: Call nlmsg_parse() with correct header length"),
      after which the kernel does not see iproute2's IFLA_EXT_MASK attribute
      in the netlink request.
      
      iproute2 adjusted for the API change in its commit 63338dca4513
      ("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter").
      
      The problem has been noticed before:
      http://marc.info/?l=linux-netdev&m=136692296022182&w=2
      (Subject: Re: getting VF link info seems to be broken in 3.9-rc8)
      
      We can do better than tell those with old userspace to upgrade. We can
      recognize the old iproute2 in the kernel by checking the netlink message
      length. Even when including the IFLA_EXT_MASK attribute, its netlink
      message is shorter than struct ifinfomsg.
      
      With this patch "ip link" shows VF information in both old and new
      iproute2 versions.
      Signed-off-by: NMichal Schmidt <mschmidt@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e5eca6d4
  4. 12 6月, 2014 1 次提交
  5. 09 6月, 2014 1 次提交
  6. 04 6月, 2014 1 次提交
  7. 24 5月, 2014 1 次提交
    • S
      net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool. · ed616689
      Sucheta Chakraborty 提交于
      o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed
        to have a bandwidth of at least this value.
        max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth
        of up to this value.
      
      o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced
        which takes 4 arguments:
        netdev, VF number, min_tx_rate, max_tx_rate
      
      o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler.
      
      o Drivers that currently implement ndo_set_vf_tx_rate should now call
        ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth
        greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet
        implemented by driver.
      
      o If user enters only one of either min_tx_rate or max_tx_rate, then,
        userland should read back the other value from driver and set both
        for IFLA_VF_RATE.
        Drivers that have not yet implemented IFLA_VF_RATE should always
        return min_tx_rate as 0 when read from ip tool.
      
      o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then
        IFLA_VF_RATE should override.
      
      o Idea is to have consistent display of rate values to user.
      
      o Usage example: -
      
        ./ip link set p4p1 vf 0 rate 900
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      
        ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps,
          min_tx_rate 200Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      
        ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5, tx rate 600 (Mbps), max_tx_rate 600Mbps,
          min_tx_rate 200Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      Signed-off-by: NSucheta Chakraborty <sucheta.chakraborty@qlogic.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ed616689
  8. 16 5月, 2014 1 次提交
  9. 25 4月, 2014 3 次提交
    • D
      rtnetlink: Only supply IFLA_VF_PORTS information when RTEXT_FILTER_VF is set · c53864fd
      David Gibson 提交于
      Since 115c9b81 (rtnetlink: Fix problem with
      buffer allocation), RTM_NEWLINK messages only contain the IFLA_VFINFO_LIST
      attribute if they were solicited by a GETLINK message containing an
      IFLA_EXT_MASK attribute with the RTEXT_FILTER_VF flag.
      
      That was done because some user programs broke when they received more data
      than expected - because IFLA_VFINFO_LIST contains information for each VF
      it can become large if there are many VFs.
      
      However, the IFLA_VF_PORTS attribute, supplied for devices which implement
      ndo_get_vf_port (currently the 'enic' driver only), has the same problem.
      It supplies per-VF information and can therefore become large, but it is
      not currently conditional on the IFLA_EXT_MASK value.
      
      Worse, it interacts badly with the existing EXT_MASK handling.  When
      IFLA_EXT_MASK is not supplied, the buffer for netlink replies is fixed at
      NLMSG_GOODSIZE.  If the information for IFLA_VF_PORTS exceeds this, then
      rtnl_fill_ifinfo() returns -EMSGSIZE on the first message in a packet.
      netlink_dump() will misinterpret this as having finished the listing and
      omit data for this interface and all subsequent ones.  That can cause
      getifaddrs(3) to enter an infinite loop.
      
      This patch addresses the problem by only supplying IFLA_VF_PORTS when
      IFLA_EXT_MASK is supplied with the RTEXT_FILTER_VF flag set.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c53864fd
    • D
      rtnetlink: Warn when interface's information won't fit in our packet · 973462bb
      David Gibson 提交于
      Without IFLA_EXT_MASK specified, the information reported for a single
      interface in response to RTM_GETLINK is expected to fit within a netlink
      packet of NLMSG_GOODSIZE.
      
      If it doesn't, however, things will go badly wrong,  When listing all
      interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first
      message in a packet as the end of the listing and omit information for
      that interface and all subsequent ones.  This can cause getifaddrs(3) to
      enter an infinite loop.
      
      This patch won't fix the problem, but it will WARN_ON() making it easier to
      track down what's going wrong.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: NJiri Pirko <jpirko@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      973462bb
    • E
      net: Use netlink_ns_capable to verify the permisions of netlink messages · 90f62cf3
      Eric W. Biederman 提交于
      It is possible by passing a netlink socket to a more privileged
      executable and then to fool that executable into writing to the socket
      data that happens to be valid netlink message to do something that
      privileged executable did not intend to do.
      
      To keep this from happening replace bare capable and ns_capable calls
      with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
      Which act the same as the previous calls except they verify that the
      opener of the socket had the desired permissions as well.
      Reported-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90f62cf3
  10. 01 4月, 2014 1 次提交
  11. 21 3月, 2014 1 次提交
  12. 19 2月, 2014 1 次提交
  13. 14 2月, 2014 1 次提交
    • C
      net: correct error path in rtnl_newlink() · 0e0eee24
      Cong Wang 提交于
      I saw the following BUG when ->newlink() fails in rtnl_newlink():
      
      [   40.240058] kernel BUG at net/core/dev.c:6438!
      
      this is due to free_netdev() is not supposed to be called before
      netdev is completely unregistered, therefore it is not correct
      to call free_netdev() here, at least for ops->newlink!=NULL case,
      many drivers call it in ->destructor so that rtnl_unlock() will
      take care of it, we probably don't need to do anything here.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0e0eee24
  14. 05 2月, 2014 1 次提交
    • F
      rtnetlink: fix oops in rtnl_link_get_slave_info_data_size · 6049f253
      Fernando Luis Vazquez Cao 提交于
      We should check whether rtnetlink link operations
      are defined before calling get_slave_size().
      
      Without this, the following oops can occur when
      adding a tap device to OVS.
      
      [   87.839553] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
      [   87.839595] IP: [<ffffffff813d47c0>] if_nlmsg_size+0xf0/0x220
      [...]
      [   87.840651] Call Trace:
      [   87.840664]  [<ffffffff813d694b>] ? rtmsg_ifinfo+0x2b/0x100
      [   87.840688]  [<ffffffff813c8340>] ? __netdev_adjacent_dev_insert+0x150/0x1a0
      [   87.840718]  [<ffffffff813d6a50>] ? rtnetlink_event+0x30/0x40
      [   87.840742]  [<ffffffff814b4144>] ? notifier_call_chain+0x44/0x70
      [   87.840768]  [<ffffffff813c8946>] ? __netdev_upper_dev_link+0x3c6/0x3f0
      [   87.840798]  [<ffffffffa0678d6c>] ? netdev_create+0xcc/0x160 [openvswitch]
      [   87.840828]  [<ffffffffa06781ea>] ? ovs_vport_add+0x4a/0xd0 [openvswitch]
      [   87.840857]  [<ffffffffa0670139>] ? new_vport+0x9/0x50 [openvswitch]
      [   87.840884]  [<ffffffffa067279e>] ? ovs_vport_cmd_new+0x11e/0x210 [openvswitch]
      [   87.840915]  [<ffffffff813f3efa>] ? genl_family_rcv_msg+0x19a/0x360
      [   87.840941]  [<ffffffff813f40c0>] ? genl_family_rcv_msg+0x360/0x360
      [   87.840967]  [<ffffffff813f4139>] ? genl_rcv_msg+0x79/0xc0
      [   87.840991]  [<ffffffff813b6cf9>] ? __kmalloc_reserve.isra.25+0x29/0x80
      [   87.841018]  [<ffffffff813f2389>] ? netlink_rcv_skb+0xa9/0xc0
      [   87.841042]  [<ffffffff813f27cf>] ? genl_rcv+0x1f/0x30
      [   87.841064]  [<ffffffff813f1988>] ? netlink_unicast+0xe8/0x1e0
      [   87.841088]  [<ffffffff813f1d9a>] ? netlink_sendmsg+0x31a/0x750
      [   87.841113]  [<ffffffff813aee96>] ? sock_sendmsg+0x86/0xc0
      [   87.841136]  [<ffffffff813c960d>] ? __netdev_update_features+0x4d/0x200
      [   87.841163]  [<ffffffff813ca94e>] ? ethtool_get_value+0x2e/0x50
      [   87.841188]  [<ffffffff813af269>] ? ___sys_sendmsg+0x359/0x370
      [   87.841212]  [<ffffffff813da686>] ? dev_ioctl+0x1a6/0x5c0
      [   87.841236]  [<ffffffff8109c210>] ? autoremove_wake_function+0x30/0x30
      [   87.841264]  [<ffffffff813ac59d>] ? sock_do_ioctl+0x3d/0x50
      [   87.841288]  [<ffffffff813aca68>] ? sock_ioctl+0x1e8/0x2c0
      [   87.841312]  [<ffffffff811934bf>] ? do_vfs_ioctl+0x2cf/0x4b0
      [   87.841335]  [<ffffffff813afeb9>] ? __sys_sendmsg+0x39/0x70
      [   87.841362]  [<ffffffff814b86f9>] ? system_call_fastpath+0x16/0x1b
      [   87.841386] Code: c0 74 10 48 89 ef ff d0 83 c0 07 83 e0 fc 48 98 49 01 c7 48 89 ef e8 d0 d6 fe ff 48 85 c0 0f 84 df 00 00 00 48 8b 90 08 07 00 00 <48> 8b 8a a8 00 00 00 31 d2 48 85 c9 74 0c 48 89 ee 48 89 c7 ff
      [   87.841529] RIP  [<ffffffff813d47c0>] if_nlmsg_size+0xf0/0x220
      [   87.841555]  RSP <ffff880221aa5950>
      [   87.841569] CR2: 00000000000000a8
      [   87.851442] ---[ end trace e42ab217691b4fc2 ]---
      Signed-off-by: NFernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6049f253
  15. 24 1月, 2014 1 次提交
  16. 23 1月, 2014 3 次提交
  17. 18 1月, 2014 1 次提交
  18. 02 1月, 2014 1 次提交
  19. 26 10月, 2013 1 次提交
    • A
      net: fix rtnl notification in atomic context · 7f294054
      Alexei Starovoitov 提交于
      commit 991fb3f7 "dev: always advertise rx_flags changes via netlink"
      introduced rtnl notification from __dev_set_promiscuity(),
      which can be called in atomic context.
      
      Steps to reproduce:
      ip tuntap add dev tap1 mode tap
      ifconfig tap1 up
      tcpdump -nei tap1 &
      ip tuntap del dev tap1 mode tap
      
      [  271.627994] device tap1 left promiscuous mode
      [  271.639897] BUG: sleeping function called from invalid context at mm/slub.c:940
      [  271.664491] in_atomic(): 1, irqs_disabled(): 0, pid: 3394, name: ip
      [  271.677525] INFO: lockdep is turned off.
      [  271.690503] CPU: 0 PID: 3394 Comm: ip Tainted: G        W    3.12.0-rc3+ #73
      [  271.703996] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
      [  271.731254]  ffffffff81a58506 ffff8807f0d57a58 ffffffff817544e5 ffff88082fa0f428
      [  271.760261]  ffff8808071f5f40 ffff8807f0d57a88 ffffffff8108bad1 ffffffff81110ff8
      [  271.790683]  0000000000000010 00000000000000d0 00000000000000d0 ffff8807f0d57af8
      [  271.822332] Call Trace:
      [  271.838234]  [<ffffffff817544e5>] dump_stack+0x55/0x76
      [  271.854446]  [<ffffffff8108bad1>] __might_sleep+0x181/0x240
      [  271.870836]  [<ffffffff81110ff8>] ? rcu_irq_exit+0x68/0xb0
      [  271.887076]  [<ffffffff811a80be>] kmem_cache_alloc_node+0x4e/0x2a0
      [  271.903368]  [<ffffffff810b4ddc>] ? vprintk_emit+0x1dc/0x5a0
      [  271.919716]  [<ffffffff81614d67>] ? __alloc_skb+0x57/0x2a0
      [  271.936088]  [<ffffffff810b4de0>] ? vprintk_emit+0x1e0/0x5a0
      [  271.952504]  [<ffffffff81614d67>] __alloc_skb+0x57/0x2a0
      [  271.968902]  [<ffffffff8163a0b2>] rtmsg_ifinfo+0x52/0x100
      [  271.985302]  [<ffffffff8162ac6d>] __dev_notify_flags+0xad/0xc0
      [  272.001642]  [<ffffffff8162ad0c>] __dev_set_promiscuity+0x8c/0x1c0
      [  272.017917]  [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
      [  272.033961]  [<ffffffff8162b109>] dev_set_promiscuity+0x29/0x50
      [  272.049855]  [<ffffffff8172e937>] packet_dev_mc+0x87/0xc0
      [  272.065494]  [<ffffffff81732052>] packet_notifier+0x1b2/0x380
      [  272.080915]  [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
      [  272.096009]  [<ffffffff81761c66>] notifier_call_chain+0x66/0x150
      [  272.110803]  [<ffffffff8108503e>] __raw_notifier_call_chain+0xe/0x10
      [  272.125468]  [<ffffffff81085056>] raw_notifier_call_chain+0x16/0x20
      [  272.139984]  [<ffffffff81620190>] call_netdevice_notifiers_info+0x40/0x70
      [  272.154523]  [<ffffffff816201d6>] call_netdevice_notifiers+0x16/0x20
      [  272.168552]  [<ffffffff816224c5>] rollback_registered_many+0x145/0x240
      [  272.182263]  [<ffffffff81622641>] rollback_registered+0x31/0x40
      [  272.195369]  [<ffffffff816229c8>] unregister_netdevice_queue+0x58/0x90
      [  272.208230]  [<ffffffff81547ca0>] __tun_detach+0x140/0x340
      [  272.220686]  [<ffffffff81547ed6>] tun_chr_close+0x36/0x60
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f294054
  20. 01 10月, 2013 1 次提交
  21. 15 8月, 2013 1 次提交
  22. 14 8月, 2013 1 次提交
    • A
      rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header · 3e805ad2
      Asbjoern Sloth Toennesen 提交于
      Fix the iproute2 command `bridge vlan show`, after switching from
      rtgenmsg to ifinfomsg.
      
      Let's start with a little history:
      
      Feb 20:   Vlad Yasevich got his VLAN-aware bridge patchset included in
                the 3.9 merge window.
                In the kernel commit 6cbdceeb, he added attribute support to
                bridge GETLINK requests sent with rtgenmsg.
      
      Mar 6th:  Vlad got this iproute2 reference implementation of the bridge
                vlan netlink interface accepted (iproute2 9eff0e5c)
      
      Apr 25th: iproute2 switched from using rtgenmsg to ifinfomsg (63338dca)
                http://patchwork.ozlabs.org/patch/239602/
                http://marc.info/?t=136680900700007
      
      Apr 28th: Linus released 3.9
      
      Apr 30th: Stephen released iproute2 3.9.0
      
      The `bridge vlan show` command haven't been working since the switch to
      ifinfomsg, or in a released version of iproute2. Since the kernel side
      only supports rtgenmsg, which iproute2 switched away from just prior to
      the iproute2 3.9.0 release.
      
      I haven't been able to find any documentation, about neither rtgenmsg
      nor ifinfomsg, and in which situation to use which, but kernel commit
      88c5b5ce seams to suggest that ifinfomsg should be used.
      
      Fixing this in kernel will break compatibility, but I doubt that anybody
      have been using it due to this bug in the user space reference
      implementation, at least not without noticing this bug. That said the
      functionality is still fully functional in 3.9, when reversing iproute2
      commit 63338dca.
      
      This could also be fixed in iproute2, but thats an ugly patch that would
      reintroduce rtgenmsg in iproute2, and from searching in netdev it seams
      like rtgenmsg usage is discouraged. I'm assuming that the only reason
      that Vlad implemented the kernel side to use rtgenmsg, was because
      iproute2 was using it at the time.
      Signed-off-by: NAsbjoern Sloth Toennesen <ast@fiberby.net>
      Reviewed-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3e805ad2
  23. 10 8月, 2013 1 次提交
  24. 31 7月, 2013 1 次提交
  25. 26 6月, 2013 1 次提交
  26. 14 6月, 2013 1 次提交
  27. 29 5月, 2013 1 次提交
  28. 25 4月, 2013 1 次提交
  29. 09 4月, 2013 1 次提交
  30. 30 3月, 2013 1 次提交
  31. 29 3月, 2013 1 次提交
  32. 28 3月, 2013 1 次提交
  33. 25 3月, 2013 1 次提交
  34. 22 3月, 2013 1 次提交
  35. 18 3月, 2013 1 次提交
    • D
      vxlan: generalize forwarding tables · 6681712d
      David Stevens 提交于
      This patch generalizes VXLAN forwarding table entries allowing an administrator
      to:
      	1) specify multiple destinations for a given MAC
      	2) specify alternate vni's in the VXLAN header
      	3) specify alternate destination UDP ports
      	4) use multicast MAC addresses as fdb lookup keys
      	5) specify multicast destinations
      	6) specify the outgoing interface for forwarded packets
      
      The combination allows configuration of more complex topologies using VXLAN
      encapsulation.
      
      Changes since v1: rebase to 3.9.0-rc2
      Signed-Off-By: NDavid L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6681712d
  36. 17 3月, 2013 1 次提交