1. 14 2月, 2013 4 次提交
  2. 03 1月, 2013 1 次提交
  3. 30 12月, 2012 1 次提交
  4. 20 12月, 2012 1 次提交
  5. 11 12月, 2012 1 次提交
  6. 06 12月, 2012 1 次提交
    • D
      bridge: implement multicast fast leave · c2d3babf
      David S. Miller 提交于
      V3: make it a flag
      V2: make the toggle per-port
      
      Fast leave allows bridge to immediately stops the multicast
      traffic on the port receives IGMP Leave when IGMP snooping is enabled,
      no timeouts are observed.
      
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      c2d3babf
  7. 19 11月, 2012 2 次提交
  8. 15 11月, 2012 3 次提交
    • S
      bridge: add root port blocking · 1007dd1a
      stephen hemminger 提交于
      This is Linux bridge implementation of root port guard.
      If BPDU is received from a leaf (edge) port, it should not
      be elected as root port.
      
      Why would you want to do this?
      If using STP on a bridge and the downstream bridges are not fully
      trusted; this prevents a hostile guest for rerouting traffic.
      
      Why not just use netfilter?
      Netfilter does not track of follow spanning tree decisions.
      It would be difficult and error prone to try and mirror STP
      resolution in netfilter module.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1007dd1a
    • S
      bridge: implement BPDU blocking · a2e01a65
      stephen hemminger 提交于
      This is Linux bridge implementation of STP protection
      (Cisco BPDU guard/Juniper BPDU block). BPDU block disables
      the bridge port if a STP BPDU packet is received.
      
      Why would you want to do this?
      If running Spanning Tree on bridge, hostile devices on the network
      may send BPDU and cause network failure. Enabling bpdu block
      will detect and stop this.
      
      How to recover the port?
      The port will be restarted if link is brought down, or
      removed and reattached.  For example:
       # ip li set dev eth0 down; ip li set dev eth0 up
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a2e01a65
    • S
      bridge: bridge port parameters over netlink · 25c71c75
      stephen hemminger 提交于
      Expose bridge port parameter over netlink. By switching to a nested
      message, this can be used for other bridge parameters.
      
      This changes IFLA_PROTINFO attribute from one byte to a full nested
      set of attributes. This is safe for application interface because the
      old message used IFLA_PROTINFO and new one uses
       IFLA_PROTINFO | NLA_F_NESTED.
      
      The code adapts to old format requests, and therefore stays
      compatible with user mode RSTP daemon. Since the type field
      for nested and unnested attributes are different, and the old
      code in libnetlink doesn't do the mask, it is also safe to use
      with old versions of bridge monitor command.
      
      Note: although mode is only a boolean, treating it as a
      full byte since in the future someone will probably want to add more
      values (like macvlan has).
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      25c71c75
  9. 01 11月, 2012 2 次提交
    • J
      net: set and query VEB/VEPA bridge mode via PF_BRIDGE · 2469ffd7
      John Fastabend 提交于
      Hardware switches may support enabling and disabling the
      loopback switch which puts the device in a VEPA mode defined
      in the IEEE 802.1Qbg specification. In this mode frames are
      not switched in the hardware but sent directly to the switch.
      SR-IOV capable NICs will likely support this mode I am
      aware of at least two such devices. Also I am told (but don't
      have any of this hardware available) that there are devices
      that only support VEPA modes. In these cases it is important
      at a minimum to be able to query these attributes.
      
      This patch adds an additional IFLA_BRIDGE_MODE attribute that can be
      set and dumped via the PF_BRIDGE:{SET|GET}LINK operations. Also
      anticipating bridge attributes that may be common for both embedded
      bridges and software bridges this adds a flags attribute
      IFLA_BRIDGE_FLAGS currently used to determine if the command or event
      is being generated to/from an embedded bridge or software bridge.
      Finally, the event generation is pulled out of the bridge module and
      into rtnetlink proper.
      
      For example using the macvlan driver in VEPA mode on top of
      an embedded switch requires putting the embedded switch into
      a VEPA mode to get the expected results.
      
      	--------  --------
              | VEPA |  | VEPA |       <-- macvlan vepa edge relays
              --------  --------
                 |        |
                 |        |
              ------------------
              |      VEPA      |       <-- embedded switch in NIC
              ------------------
                      |
                      |
              -------------------
              | external switch |      <-- shiny new physical
      	-------------------          switch with VEPA support
      
      A packet sent from the macvlan VEPA at the top could be
      loopbacked on the embedded switch and never seen by the
      external switch. So in order for this to work the embedded
      switch needs to be set in the VEPA state via the above
      described commands.
      
      By making these attributes nested in IFLA_AF_SPEC we allow
      future extensions to be made as needed.
      
      CC: Lennert Buytenhek <buytenh@wantstofly.org>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2469ffd7
    • J
      net: create generic bridge ops · e5a55a89
      John Fastabend 提交于
      The PF_BRIDGE:RTM_{GET|SET}LINK nlmsg family and type are
      currently embedded in the ./net/bridge module. This prohibits
      them from being used by other bridging devices. One example
      of this being hardware that has embedded bridging components.
      
      In order to use these nlmsg types more generically this patch
      adds two net_device_ops hooks. One to set link bridge attributes
      and another to dump the current bride attributes.
      
      	ndo_bridge_setlink()
      	ndo_bridge_getlink()
      
      CC: Lennert Buytenhek <buytenh@wantstofly.org>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e5a55a89
  10. 11 9月, 2012 1 次提交
  11. 27 6月, 2012 1 次提交
  12. 16 4月, 2012 2 次提交
    • J
      net: add generic PF_BRIDGE:RTM_ FDB hooks · 77162022
      John Fastabend 提交于
      This adds two new flags NTF_MASTER and NTF_SELF that can
      now be used to specify where PF_BRIDGE netlink commands should
      be sent. NTF_MASTER sends the commands to the 'dev->master'
      device for parsing. Typically this will be the linux net/bridge,
      or open-vswitch devices. Also without any flags set the command
      will be handled by the master device as well so that current user
      space tools continue to work as expected.
      
      The NTF_SELF flag will push the PF_BRIDGE commands to the
      device. In the basic example below the commands are then parsed
      and programmed in the embedded bridge.
      
      Note if both NTF_SELF and NTF_MASTER bits are set then the
      command will be sent to both 'dev->master' and 'dev' this allows
      user space to easily keep the embedded bridge and software bridge
      in sync.
      
      There is a slight complication in the case with both flags set
      when an error occurs. To resolve this the rtnl handler clears
      the NTF_ flag in the netlink ack to indicate which sets completed
      successfully. The add/del handlers will abort as soon as any
      error occurs.
      
      To support this new net device ops were added to call into
      the device and the existing bridging code was refactored
      to use these. There should be no required changes in user space
      to support the current bridge behavior.
      
      A basic setup with a SR-IOV enabled NIC looks like this,
      
                veth0  veth2
                  |      |
                ------------
                |  bridge0 |   <---- software bridging
                ------------
                     /
                     /
        ethx.y      ethx
          VF         PF
           \         \          <---- propagate FDB entries to HW
           \         \
        --------------------
        |  Embedded Bridge |    <---- hardware offloaded switching
        --------------------
      
      In this case the embedded bridge must be managed to allow 'veth0'
      to communicate with 'ethx.y' correctly. At present drivers managing
      the embedded bridge either send frames onto the network which
      then get dropped by the switch OR the embedded bridge will flood
      these frames. With this patch we have a mechanism to manage the
      embedded bridge correctly from user space. This example is specific
      to SR-IOV but replacing the VF with another PF or dropping this
      into the DSA framework generates similar management issues.
      
      Examples session using the 'br'[1] tool to add, dump and then
      delete a mac address with a new "embedded" option and enabled
      ixgbe driver:
      
      # br fdb add 22:35:19:ac:60:59 dev eth3
      # br fdb
      port    mac addr                flags
      veth0   22:35:19:ac:60:58       static
      veth0   9a:5f:81:f7:f6:ec       local
      eth3    00:1b:21:55:23:59       local
      eth3    22:35:19:ac:60:59       static
      veth0   22:35:19:ac:60:57       static
      #br fdb add 22:35:19:ac:60:59 embedded dev eth3
      #br fdb
      port    mac addr                flags
      veth0   22:35:19:ac:60:58       static
      veth0   9a:5f:81:f7:f6:ec       local
      eth3    00:1b:21:55:23:59       local
      eth3    22:35:19:ac:60:59       static
      veth0   22:35:19:ac:60:57       static
      eth3    22:35:19:ac:60:59       local embedded
      #br fdb del 22:35:19:ac:60:59 embedded dev eth3
      
      I added a couple lines to 'br' to set the flags correctly is all. It
      is my opinion that the merit of this patch is now embedded and SW
      bridges can both be modeled correctly in user space using very nearly
      the same message passing.
      
      [1] 'br' tool was published as an RFC here and will be renamed 'bridge'
          http://patchwork.ozlabs.org/patch/117664/
      
      Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for
      valuable feedback, suggestions, and review.
      
      v2: fixed api descriptions and error case with both NTF_SELF and
          NTF_MASTER set plus updated patch description.
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      77162022
    • E
      net: cleanup unsigned to unsigned int · 95c96174
      Eric Dumazet 提交于
      Use of "unsigned int" is preferred to bare "unsigned" in net tree.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      95c96174
  13. 02 4月, 2012 1 次提交
  14. 02 12月, 2011 1 次提交
  15. 19 10月, 2011 1 次提交
  16. 23 7月, 2011 1 次提交
  17. 10 6月, 2011 1 次提交
    • G
      rtnetlink: Compute and store minimum ifinfo dump size · c7ac8679
      Greg Rose 提交于
      The message size allocated for rtnl ifinfo dumps was limited to
      a single page.  This is not enough for additional interface info
      available with devices that support SR-IOV and caused a bug in
      which VF info would not be displayed if more than approximately
      40 VFs were created per interface.
      
      Implement a new function pointer for the rtnl_register service that will
      calculate the amount of data required for the ifinfo dump and allocate
      enough data to satisfy the request.
      Signed-off-by: NGreg Rose <gregory.v.rose@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      c7ac8679
  18. 03 5月, 2011 1 次提交
    • E
      net: dont hold rtnl mutex during netlink dump callbacks · e67f88dd
      Eric Dumazet 提交于
      Four years ago, Patrick made a change to hold rtnl mutex during netlink
      dump callbacks.
      
      I believe it was a wrong move. This slows down concurrent dumps, making
      good old /proc/net/ files faster than rtnetlink in some situations.
      
      This occurred to me because one "ip link show dev ..." was _very_ slow
      on a workload adding/removing network devices in background.
      
      All dump callbacks are able to use RCU locking now, so this patch does
      roughly a revert of commits :
      
      1c2d670f : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
      6313c1e0 : [RTNETLINK]: Remove unnecessary locking in dump callbacks
      
      This let writers fight for rtnl mutex and readers going full speed.
      
      It also takes care of phonet : phonet_route_get() is now called from rcu
      read section. I renamed it to phonet_route_get_rcu()
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
      Acked-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e67f88dd
  19. 05 4月, 2011 3 次提交
  20. 16 11月, 2010 2 次提交
  21. 16 6月, 2010 1 次提交
  22. 16 5月, 2010 1 次提交
  23. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  24. 25 2月, 2009 1 次提交
    • P
      netlink: change nlmsg_notify() return value logic · 1ce85fe4
      Pablo Neira Ayuso 提交于
      This patch changes the return value of nlmsg_notify() as follows:
      
      If NETLINK_BROADCAST_ERROR is set by any of the listeners and
      an error in the delivery happened, return the broadcast error;
      else if there are no listeners apart from the socket that
      requested a change with the echo flag, return the result of the
      unicast notification. Thus, with this patch, the unicast
      notification is handled in the same way of a broadcast listener
      that has set the NETLINK_BROADCAST_ERROR socket flag.
      
      This patch is useful in case that the caller of nlmsg_notify()
      wants to know the result of the delivery of a netlink notification
      (including the broadcast delivery) and take any action in case
      that the delivery failed. For example, ctnetlink can drop packets
      if the event delivery failed to provide reliable logging and
      state-synchronization at the cost of dropping packets.
      
      This patch also modifies the rtnetlink code to ignore the return
      value of rtnl_notify() in all callers. The function rtnl_notify()
      (before this patch) returned the error of the unicast notification
      which makes rtnl_set_sk_err() reports errors to all listeners. This
      is not of any help since the origin of the change (the socket that
      requested the echoing) notices the ENOBUFS error if the notification
      fails and should resync itself.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1ce85fe4
  25. 09 9月, 2008 1 次提交
  26. 26 3月, 2008 1 次提交
  27. 29 1月, 2008 2 次提交
  28. 11 10月, 2007 1 次提交
    • E
      [NET]: Make the device list and device lookups per namespace. · 881d966b
      Eric W. Biederman 提交于
      This patch makes most of the generic device layer network
      namespace safe.  This patch makes dev_base_head a
      network namespace variable, and then it picks up
      a few associated variables.  The functions:
      dev_getbyhwaddr
      dev_getfirsthwbytype
      dev_get_by_flags
      dev_get_by_name
      __dev_get_by_name
      dev_get_by_index
      __dev_get_by_index
      dev_ioctl
      dev_ethtool
      dev_load
      wireless_process_ioctl
      
      were modified to take a network namespace argument, and
      deal with it.
      
      vlan_ioctl_set and brioctl_set were modified so their
      hooks will receive a network namespace argument.
      
      So basically anthing in the core of the network stack that was
      affected to by the change of dev_base was modified to handle
      multiple network namespaces.  The rest of the network stack was
      simply modified to explicitly use &init_net the initial network
      namespace.  This can be fixed when those components of the network
      stack are modified to handle multiple network namespaces.
      
      For now the ifindex generator is left global.
      
      Fundametally ifindex numbers are per namespace, or else
      we will have corner case problems with migration when
      we get that far.
      
      At the same time there are assumptions in the network stack
      that the ifindex of a network device won't change.  Making
      the ifindex number global seems a good compromise until
      the network stack can cope with ifindex changes when
      you change namespaces, and the like.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      881d966b