1. 22 5月, 2015 1 次提交
    • S
      rocker: do not delete fdb entries in rocker_port_fdb_flush() when preparing transactions · 3098ac39
      Simon Horman 提交于
      rocker_port_fdb_flush() is called by rocker_port_stp_update() which in
      turn may be called with trans == SWITCHDEV_TRANS_PREPARE and then
      trans == SWITCHDEV_TRANS_COMMIT from switchdev_port_attr_set() via
      br_set_state().
      
      When rocker_port_fdb_flush() is called with trans == SWITCHDEV_TRANS_PREPARE
      it calls rocker_port_fdb_learn() for each entry in the FDB table which in
      turn calls rocker_flow_tbl_bridge() which will allocate memory using
      rocker_port_kzalloc(). rocker_port_fdb_learn() will then remove the entry
      from the FDB table.
      
      Then when rocker_port_fdb_learn() is called with
      trans == SWITCHDEV_TRANS_PREPARE no calls are made to rocker_port_fdb_learn()
      because there are no longer any entries present in the FDB table. Thus the
      memory previously allocated by rocker_port_fdb_learn() is leaked resulting
      in the kernel BUG() below.
      
      Furthermore, it looks like the driver ends up with an incorrect view of the
      fdb table as the FDB entries are purged from the driver's table but not the
      hardware's table.
      
      ip link add br0 type bridge
      ip link set up dev eth0
      sleep 1
      ip link set dev eth0 master br0
      [    3.704360] ------------[ cut here ]------------
      [    3.704611] kernel BUG at drivers/net/ethernet/rocker/rocker.c:4289!
      [    3.704962] invalid opcode: 0000 [#1] SMP
      [    3.705537] Modules linked in:
      [    3.705919] CPU: 0 PID: 63 Comm: ip Not tainted 4.1.0-rc3-01046-gb9fbe709 #1044
      [    3.706191] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org 04/01/2014
      [    3.706820] task: ffff880019f70150 ti: ffff88001f92c000 task.ti: ffff88001f92c000
      [    3.707138] RIP: 0010:[<ffffffff811f0080>]  [<ffffffff811f0080>] rocker_port_attr_set+0xe0/0xf0
      [    3.707990] RSP: 0018:ffff88001f92f808  EFLAGS: 00000212
      [    3.708200] RAX: ffff880019d4fa68 RBX: ffff880019d4f000 RCX: 0000000000000000
      [    3.708471] RDX: 000000000000000c RSI: ffff88001f92f890 RDI: ffff880019d4f680
      [    3.708740] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000004
      [    3.708999] R10: ffff880000034024 R11: 0000000000000000 R12: ffff88001f92f890
      [    3.709276] R13: ffff88001f8f1c00 R14: 000000000000000b R15: 0000000000000000
      [    3.709303] FS:  00007f8ab66bd700(0000) GS:ffff88001b000000(0000) knlGS:0000000000000000
      [    3.709303] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    3.709303] CR2: 0000000000654988 CR3: 000000001f8f3000 CR4: 00000000000006b0
      [    3.709303] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [    3.709303] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
      [    3.709303] Stack:
      [    3.709303]  ffff88001f8f1c00 000000000000000b ffff88001f92f890 ffff880019d4f000
      [    3.709303]  ffff88001f92f890 ffffffff813332f5 ffff88001f92f880 0000000000000000
      [    3.709303]  ffff88001f92f890 0000000000000001 ffff880019d4f000 ffffffff81333627
      [    3.709303] Call Trace:
      [    3.709303]  [<ffffffff813332f5>] ? __switchdev_port_attr_set+0x25/0x90
      [    3.709303]  [<ffffffff81333627>] ? switchdev_port_attr_set+0x27/0x120
      [    3.709303]  [<ffffffff81318e86>] ? br_set_state+0x36/0x50
      [    3.709303]  [<ffffffff8131795c>] ? br_add_if+0x37c/0x400
      [    3.709303]  [<ffffffff81238ce1>] ? do_setlink+0x7e1/0x800
      [    3.709303]  [<ffffffff8111f980>] ? radix_tree_lookup_slot+0x10/0x30
      [    3.709303]  [<ffffffff81136fba>] ? nla_parse+0xaa/0x110
      [    3.709303]  [<ffffffff81239c98>] ? rtnl_newlink+0x548/0x870
      [    3.709303]  [<ffffffff8111f900>] ? __radix_tree_lookup+0x40/0xb0
      [    3.709303]  [<ffffffff81136f3e>] ? nla_parse+0x2e/0x110
      [    3.709303]  [<ffffffff81237d7e>] ? rtnetlink_rcv_msg+0x7e/0x250
      [    3.709303]  [<ffffffff8121d1be>] ? __skb_recv_datagram+0xfe/0x4b0
      [    3.709303]  [<ffffffff81237d00>] ? rtnetlink_rcv+0x30/0x30
      [    3.709303]  [<ffffffff81247948>] ? netlink_rcv_skb+0xa8/0xd0
      [    3.709303]  [<ffffffff81237cef>] ? rtnetlink_rcv+0x1f/0x30
      [    3.709303]  [<ffffffff81247210>] ? netlink_unicast+0x150/0x200
      [    3.709303]  [<ffffffff81247704>] ? netlink_sendmsg+0x374/0x3e0
      [    3.709303]  [<ffffffff8120f8cf>] ? sock_sendmsg+0xf/0x30
      [    3.709303]  [<ffffffff8120ffc3>] ? ___sys_sendmsg+0x1f3/0x200
      [    3.709303]  [<ffffffff812100d5>] ? ___sys_recvmsg+0x105/0x140
      [    3.709303]  [<ffffffff812228d9>] ? dev_get_by_name_rcu+0x69/0x90
      [    3.709303]  [<ffffffff812228d9>] ? dev_get_by_name_rcu+0x69/0x90
      [    3.709303]  [<ffffffff81217b7d>] ? skb_dequeue+0x4d/0x60
      [    3.709303]  [<ffffffff81217bb0>] ? skb_queue_purge+0x20/0x30
      [    3.709303]  [<ffffffff810ebdcf>] ? __inode_wait_for_writeback+0x5f/0xb0
      [    3.709303]  [<ffffffff810648b0>] ? autoremove_wake_function+0x30/0x30
      [    3.709303]  [<ffffffff81210ee9>] ? __sys_sendmsg+0x39/0x70
      [    3.709303]  [<ffffffff8133e097>] ? system_call_fastpath+0x12/0x6a
      [    3.709303] Code: bb 90 06 00 00 48 c7 04 24 00 00 00 00 45 31 c9 45 31 c0 48 c7 c1 c0 b7 1e 81 89 ea e8 da da ff ff eb 95 0f 1f 84 00 00 00 00 00 <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 83 fe 15 75
      [    3.709303] RIP  [<ffffffff811f0080>] rocker_port_attr_set+0xe0/0xf0
      [    3.709303]  RSP <ffff88001f92f808>
      [    3.721409] ---[ end trace b7481fcb7cb032aa ]---
      Segmentation fault
      
      Fixes: c4f20321 ("rocker: support prepare-commit transaction model")
      Acked-by: NScott Feldman <sfeldma@gmail.com>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NSimon Horman <simon.horman@netronome.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3098ac39
  2. 18 5月, 2015 1 次提交
  3. 16 5月, 2015 1 次提交
  4. 14 5月, 2015 2 次提交
  5. 13 5月, 2015 13 次提交
  6. 02 5月, 2015 1 次提交
  7. 30 4月, 2015 1 次提交
    • N
      bridge/nl: remove wrong use of NLM_F_MULTI · 46c264da
      Nicolas Dichtel 提交于
      NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
      it is sent only at the end of a dump.
      
      Libraries like libnl will wait forever for NLMSG_DONE.
      
      Fixes: e5a55a89 ("net: create generic bridge ops")
      Fixes: 815cccbf ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
      CC: John Fastabend <john.r.fastabend@intel.com>
      CC: Sathya Perla <sathya.perla@emulex.com>
      CC: Subbu Seetharaman <subbu.seetharaman@emulex.com>
      CC: Ajit Khaparde <ajit.khaparde@emulex.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: intel-wired-lan@lists.osuosl.org
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: Scott Feldman <sfeldma@gmail.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: bridge@lists.linux-foundation.org
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46c264da
  8. 17 4月, 2015 1 次提交
  9. 25 3月, 2015 1 次提交
  10. 19 3月, 2015 1 次提交
  11. 17 3月, 2015 1 次提交
  12. 16 3月, 2015 1 次提交
  13. 10 3月, 2015 1 次提交
  14. 07 3月, 2015 3 次提交
  15. 06 3月, 2015 1 次提交
    • S
      rocker: implement IPv4 fib offloading · c1beeef7
      Scott Feldman 提交于
      The driver implements ndo_switch_fib_ipv4_add/del ops to add/del/mod IPv4
      routes to/from switchdev device.  Once a route is added to the device, and the
      route's nexthops are resolved to neighbor MAC address, the device will forward
      matching pkts rather than the kernel.  This offloads the L3 forwarding path
      from the kernel to the device.  Note that control and management planes are
      still mananged by Linux; only the data plane is offloaded.  Standard routing
      control protocols such as OSPF and BGP run on Linux and manage the kernel's FIB
      via standard rtm netlink msgs...nothing changes here.
      
      A new hash table is added to rocker to track neighbors.  The driver listens for
      neighbor updates events using netevent notifier NETEVENT_NEIGH_UPDATE.  Any ARP
      table updates for ports on this device are recorded in this table.  Routes
      installed to the device with nexthops that reference neighbors in this table
      are "qualified".  In the case of a route with nexthops not resolved in the
      table, the kernel is asked to resolve the nexthop.
      
      The driver uses fib_info->fib_priority for the priority field in rocker's
      unicast routing table.
      
      The device can only forward to pkts matching route dst to resolved nexthops.
      Currently, the device only supports single-path routes (i.e. routes with one
      nexthop).  Equal Cost Multipath (ECMP) route support will be added in followup
      patches.
      
      This patch is driver support for unicast IPv4 routing only.  Followup patches
      will add driver and infrastructure for IPv6 routing and multicast routing.
      Signed-off-by: NScott Feldman <sfeldma@gmail.com>
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c1beeef7
  16. 28 2月, 2015 2 次提交
  17. 27 2月, 2015 3 次提交
  18. 02 2月, 2015 3 次提交
  19. 18 1月, 2015 2 次提交
    • D
      net: rocker: Add basic netdev counters - v2 · f2bbca51
      David Ahern 提交于
      Add packet and byte counters for RX and TX paths.
      
      $ ifconfig eth1
      eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
              inet6 fe80::5054:ff:fe12:3501  prefixlen 64  scopeid 0x20<link>
              ether 52:54:00:12:35:01  txqueuelen 1000  (Ethernet)
              RX packets 63  bytes 15813 (15.4 KiB)
              RX errors 1  dropped 0  overruns 0  frame 0
              TX packets 79  bytes 17991 (17.5 KiB)
              TX errors 7  dropped 0 overruns 0  carrier 0  collisions 0
      
      Rx / Tx errors tested by injecting faults in qemu's hardware model for Rocker.
      
      v2:
      - moved counter locations to avoid potential use after free per Florian's comment
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Scott Feldman <sfeldma@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NScott Feldman <sfeldma@gmail.com>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f2bbca51
    • J
      netlink: make nlmsg_end() and genlmsg_end() void · 053c095a
      Johannes Berg 提交于
      Contrary to common expectations for an "int" return, these functions
      return only a positive value -- if used correctly they cannot even
      return 0 because the message header will necessarily be in the skb.
      
      This makes the very common pattern of
      
        if (genlmsg_end(...) < 0) { ... }
      
      be a whole bunch of dead code. Many places also simply do
      
        return nlmsg_end(...);
      
      and the caller is expected to deal with it.
      
      This also commonly (at least for me) causes errors, because it is very
      common to write
      
        if (my_function(...))
          /* error condition */
      
      and if my_function() does "return nlmsg_end()" this is of course wrong.
      
      Additionally, there's not a single place in the kernel that actually
      needs the message length returned, and if anyone needs it later then
      it'll be very easy to just use skb->len there.
      
      Remove this, and make the functions void. This removes a bunch of dead
      code as described above. The patch adds lines because I did
      
      -	return nlmsg_end(...);
      +	nlmsg_end(...);
      +	return 0;
      
      I could have preserved all the function's return values by returning
      skb->len, but instead I've audited all the places calling the affected
      functions and found that none cared. A few places actually compared
      the return value with <= 0 in dump functionality, but that could just
      be changed to < 0 with no change in behaviour, so I opted for the more
      efficient version.
      
      One instance of the error I've made numerous times now is also present
      in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
      check for <0 or <=0 and thus broke out of the loop every single time.
      I've preserved this since it will (I think) have caused the messages to
      userspace to be formatted differently with just a single message for
      every SKB returned to userspace. It's possible that this isn't needed
      for the tools that actually use this, but I don't even know what they
      are so couldn't test that changing this behaviour would be acceptable.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      053c095a