1. 01 Aug, 2015 1 commit
  2. 22 Jul, 2015 1 commit
    • T
      vxlan: Flow based tunneling · ee122c79
      Committed by Thomas Graf
      Allows putting a VXLAN device into a new flow-based mode in which
      skbs with an ip_tunnel_info dst metadata attached will be encapsulated
      according to the instructions stored there, with the VXLAN device
      defaults taken into consideration.
      
      Similarly, on the receive side, if the VXLAN_F_COLLECT_METADATA flag is
      set, the packet processing will populate an ip_tunnel_info struct for
      each packet received and attach it to the skb using the new metadata
      dst.  The metadata structure will contain the outer header and tunnel
      header fields which have been stripped off. Layers further up in the
      stack such as routing, tc or netfilter can later match on these fields
      and perform forwarding. It is the responsibility of upper layers to
      ensure that the flag is set if the metadata is needed. The flag limits
      the additional cost of metadata collection to cases where it is needed.
      
      This prepares the VXLAN device to be steered by the routing and other
      subsystems, which makes it possible to support encapsulation for a large
      number of tunnel endpoints and tunnel IDs through a single net_device,
      improving scalability.
      
      It also allows for OVS to leverage this mode which in turn allows for
      the removal of the OVS specific VXLAN code.
      
      Because the skb is currently scrubbed in vxlan_rcv(), the attachment of
      the new dst metadata is postponed until after scrubbing, which requires
      the temporary addition of a new member to vxlan_metadata. This member
      is removed again in a later commit after the indirect VXLAN receive API
      has been removed.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 16 Jul, 2015 1 commit
  4. 16 Jun, 2015 3 commits
  5. 02 Jun, 2015 2 commits
  6. 14 May, 2015 1 commit
  7. 11 May, 2015 1 commit
  8. 11 Apr, 2015 1 commit
  9. 24 Mar, 2015 1 commit
    • H
      ipv6: generation of stable privacy addresses for link-local and autoconf · 622c81d5
      Committed by Hannes Frederic Sowa
      This patch implements the stable privacy address generation for
      link-local and autoconf addresses as specified in RFC7217.
      
        RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key)
      
      is the RID (random identifier). As the hash function F we chose one
      round of sha1. Prefix will be either the link-local prefix or the
      router advertised one. As Net_Iface we use the MAC address of the
      device. DAD_Counter and secret_key are implemented as specified.
      
      We don't use Network_ID, as it couples the code too closely to other
      subsystems. It is specified as optional in the RFC.
      
      As Net_Iface we use only the MAC address, because the kernel has no
      other stable identifier available: this code might run very early, so
      we cannot depend on interface names, which user space might change
      early on during the boot process.
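
The RID computation described above can be sketched as follows. This is a minimal illustration, not the kernel code: it uses the complete SHA-1 from Python's hashlib as a stand-in for F, and the order and encoding of the hashed fields are assumptions made for the example. The function names are hypothetical. Network_ID is left out, as in the patch.

```python
import hashlib
import ipaddress


def stable_privacy_iid(prefix: bytes, mac: bytes, dad_counter: int,
                       secret: bytes) -> bytes:
    """Return a 64-bit interface identifier from F(Prefix, Net_Iface,
    DAD_Counter, secret_key). Field order and encoding are assumptions."""
    if len(prefix) != 8:
        raise ValueError("expected the 8-byte /64 prefix")
    h = hashlib.sha1()
    h.update(prefix)                          # link-local or advertised prefix
    h.update(mac)                             # Net_Iface: the MAC address
    h.update(dad_counter.to_bytes(1, "big"))  # bumped after a DAD failure
    h.update(secret)                          # stable_secret
    return h.digest()[:8]                     # keep 64 bits as the RID


def stable_privacy_address(prefix: bytes, mac: bytes, dad_counter: int,
                           secret: bytes) -> ipaddress.IPv6Address:
    """Combine the /64 prefix with the RID into a full IPv6 address."""
    rid = stable_privacy_iid(prefix, mac, dad_counter, secret)
    return ipaddress.IPv6Address(prefix + rid)
```

The same inputs always yield the same address, while a different DAD_Counter (or a different prefix) yields an unrelated identifier, which is the property RFC 7217 is after.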
      
      A new address generation mode is introduced,
      IN6_ADDR_GEN_MODE_STABLE_PRIVACY. With iproute2 one can switch back to
      the none or eui64 address configuration mode even if the stable_secret
      is already set.
      
      We refuse writes to ipv6/conf/all/stable_secret and only allow writes
      to ipv6/conf/default/stable_secret and the interface-specific file.
      The default stable_secret is used as the parameter for the namespace;
      the interface-specific one can override the secret, e.g. when
      switching a network configuration from one system to another while
      inheriting the secret.
      
      Cc: Erik Kline <ek@google.com>
      Cc: Fernando Gont <fgont@si6networks.com>
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 19 Mar, 2015 2 commits
  11. 06 Mar, 2015 1 commit
    • J
      bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi · 842a9ae0
      Committed by Jouni Malinen
      This extends the design in commit 95850116 ("bridge: Add support for
      IEEE 802.11 Proxy ARP") with an optional set of rules that are needed to
      meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The
      previously added BR_PROXYARP behavior is left as-is and a new
      BR_PROXYARP_WIFI alternative is added so that this behavior can be
      configured from user space when required.
      
      In addition, this enables proxyarp functionality for unicast ARP
      requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible
      to use unicast as well as broadcast for these frames.
      
      The key differences in functionality:
      
      BR_PROXYARP:
      - uses the flag on the bridge port on which the request frame was
        received to determine whether to reply
      - blocks bridge port flooding completely on ports that enable proxy ARP

      BR_PROXYARP_WIFI:
      - uses the flag on the bridge port to which the target device of the
        request belongs
      - blocks bridge port flooding selectively based on whether the proxyarp
        functionality replied
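
The difference between the two modes can be sketched as two small predicates. This is only an illustration of the rules listed above: the flag values and function names are hypothetical, and the neighbor lookup that decides whether a reply can actually be built is omitted.

```python
# Hypothetical flag values, chosen for this sketch only.
BR_PROXYARP = 0x1
BR_PROXYARP_WIFI = 0x2


def may_reply(in_port_flags: int, target_port_flags: int) -> bool:
    """Decide whether the bridge may answer an ARP request on behalf of
    the target. BR_PROXYARP is checked on the port the request arrived
    on; BR_PROXYARP_WIFI on the port the target device belongs to."""
    if in_port_flags & BR_PROXYARP:
        return True
    return bool(target_port_flags & BR_PROXYARP_WIFI)


def may_flood(out_port_flags: int, replied: bool) -> bool:
    """Decide whether the request may still be flooded to an output port."""
    if out_port_flags & BR_PROXYARP:
        return False        # flooding is blocked completely
    if out_port_flags & BR_PROXYARP_WIFI:
        return not replied  # blocked only if the proxy already replied
    return True
```
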
      Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 12 Feb, 2015 1 commit
  13. 20 Jan, 2015 1 commit
  14. 15 Jan, 2015 2 commits
    • T
      vxlan: Group Policy extension · 3511494c
      Committed by Thomas Graf
      Implements support for the Group Policy VXLAN extension [0] to provide
      a lightweight and simple security label mechanism across network peers
      based on VXLAN. The security context and associated metadata are mapped
      to/from skb->mark. This allows further mapping to a SELinux context
      using SECMARK, and implementing ACLs directly with nftables, iptables,
      OVS, tc, etc.
      
      The group membership is defined by the lower 16 bits of skb->mark, the
      upper 16 bits are used for flags.
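
As a small sketch of that layout, the helpers below split a 32-bit mark into group and flags and build one back. The function and constant names are hypothetical, and the meaning of individual flag bits is not modelled.

```python
from typing import Tuple

GROUP_MASK = 0xFFFF  # lower 16 bits: group membership
FLAGS_SHIFT = 16     # upper 16 bits: flags


def split_mark(mark: int) -> Tuple[int, int]:
    """Return (group_id, flags) for a 32-bit skb->mark value."""
    return mark & GROUP_MASK, (mark >> FLAGS_SHIFT) & 0xFFFF


def make_mark(group_id: int, flags: int = 0) -> int:
    """Build a 32-bit mark from a 16-bit group id and 16 bits of flags."""
    if not 0 <= group_id <= 0xFFFF or not 0 <= flags <= 0xFFFF:
        raise ValueError("group_id and flags must each fit in 16 bits")
    return (flags << FLAGS_SHIFT) | group_id
```

With this layout, a mark of 0x200 corresponds to group 0x200 with no flags set.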
      
      SELinux allows managing labels to secure local resources. However,
      distributed applications require ACLs to be implemented across hosts. This
      is typically achieved by matching on L2-L4 fields to identify the
      original sending host and process on the receiver. On top of that,
      netlabel and specifically CIPSO [1] allow mapping security contexts to
      universal labels.  However, netlabel and CIPSO are relatively complex.
      This patch provides a lightweight alternative for overlay network
      environments with a trusted underlay. No additional control protocol
      is required.
      
                 Host 1:                       Host 2:
      
            Group A        Group B        Group B     Group A
            +-----+   +-------------+    +-------+   +-----+
            | lxc |   | SELinux CTX |    | httpd |   | VM  |
            +--+--+   +--+----------+    +---+---+   +--+--+
                \---+---/                     \----+---/
                    |                              |
                +---+---+                      +---+---+
                | vxlan |                      | vxlan |
                +---+---+                      +---+---+
                    +------------------------------+
      
      Backwards compatibility:
      A VXLAN-GBP socket can receive standard VXLAN frames and will assign
      the default group 0x0000 to such frames. A Linux VXLAN socket will
      drop VXLAN-GBP frames. The extension is therefore disabled by default
      and needs to be specifically enabled:
      
         ip link add [...] type vxlan [...] gbp
      
      In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket
      must run on a separate port number.
      
      Examples:
       iptables:
        host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200
        host2# iptables -I INPUT -m mark --mark 0x200 -j DROP
      
       OVS:
        # ovs-ofctl add-flow br0 'in_port=1,actions=load:0x200->NXM_NX_TUN_GBP_ID[],NORMAL'
        # ovs-ofctl add-flow br0 'in_port=2,tun_gbp_id=0x200,actions=drop'
      
      [0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy
      [1] http://lwn.net/Articles/204905/

      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      vxlan: Remote checksum offload · dfd8645e
      Committed by Tom Herbert
      Add support for remote checksum offload in VXLAN. This uses a
      reserved bit to indicate that RCO is being done, and uses the low order
      reserved eight bits of the VNI to hold the start and offset values in a
      compressed manner.
      
      Start is encoded in the low order seven bits of VNI. This is start >> 1
      so that the checksum start offset is 0-254 using even values only.
      Checksum offset (transport checksum field) is indicated in the high
      order bit in the low order byte of the VNI. If the bit is set, the
      checksum field is for UDP (so offset = start + 6), else checksum
      field is for TCP (so offset = start + 16). Only TCP and UDP are
      supported in this implementation.
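
The encoding described above can be sketched as a pair of helpers operating on the low-order byte of the VNI field. The names are hypothetical and the sketch covers only that one byte, not the rest of the VXLAN header.

```python
from typing import Tuple

RCO_START_MASK = 0x7F  # low seven bits hold start >> 1
RCO_UDP_BIT = 0x80     # high bit of the byte: checksum field is UDP
UDP_CSUM_OFFSET = 6    # UDP checksum sits 6 bytes into the header
TCP_CSUM_OFFSET = 16   # TCP checksum sits 16 bytes into the header


def encode_rco(start: int, is_udp: bool) -> int:
    """Pack the checksum start and protocol into one byte."""
    if start % 2 or not 0 <= start <= 254:
        raise ValueError("start must be an even value in 0..254")
    value = start >> 1
    if is_udp:
        value |= RCO_UDP_BIT
    return value


def decode_rco(value: int) -> Tuple[int, int]:
    """Return (start, offset) recovered from the encoded byte."""
    start = (value & RCO_START_MASK) << 1
    if value & RCO_UDP_BIT:
        return start, start + UDP_CSUM_OFFSET
    return start, start + TCP_CSUM_OFFSET
```

For example, a TCP segment whose checksum coverage starts at byte 20 encodes to 10, and decodes back to start 20 with the checksum field at offset 36.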
      
      Remote checksum offload for VXLAN is described in:
      
      https://tools.ietf.org/html/draft-herbert-vxlan-rco-00
      
      Tested by running 200 TCP_STREAM connections with VXLAN (over IPv4).
      
      With UDP checksums and Remote Checksum Offload
        IPv4
            Client
              11.84% CPU utilization
            Server
              12.96% CPU utilization
            9197 Mbps
        IPv6
            Client
              12.46% CPU utilization
            Server
              14.48% CPU utilization
            8963 Mbps
      
      With UDP checksums, no remote checksum offload
        IPv4
            Client
              15.67% CPU utilization
            Server
              14.83% CPU utilization
            9094 Mbps
        IPv6
            Client
              16.21% CPU utilization
            Server
              14.32% CPU utilization
            9058 Mbps
      
      No UDP checksums
        IPv4
            Client
              15.03% CPU utilization
            Server
              23.09% CPU utilization
            9089 Mbps
        IPv6
            Client
              16.18% CPU utilization
            Server
              26.57% CPU utilization
            8954 Mbps
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 03 Dec, 2014 2 commits
  16. 25 Nov, 2014 1 commit
    • M
      ipvlan: Initial check-in of the IPVLAN driver. · 2ad7bf36
      Committed by Mahesh Bandewar
      This driver is very similar to the macvlan driver except that it
      uses L3 on the frame to determine the logical interface while
      functioning as packet dispatcher. It inherits L2 of the master
      device hence the packets on wire will have the same L2 for all
      the packets originating from all virtual devices off of the same
      master device.
      
      This driver was developed keeping the namespace use-case in
      mind. Hence most of the examples given here take that as the
      base setup where main-device belongs to the default-ns and
      virtual devices are assigned to the additional namespaces.
      
      The device operates in two different modes and the difference
      between these two modes is primarily on the TX side.
      
      (a) L2 mode : In this mode, the device behaves as an L2 device.
      TX processing up to L2 happens on the stack of the virtual device
      (in its associated namespace). Packets are switched after that
      into the main device (default-ns) and queued for xmit.
      
      RX processing is simple and all multicast, broadcast (if
      applicable), and unicast belonging to the address(es) are
      delivered to the virtual devices.
      
      (b) L3 mode : In this mode, the device behaves like an L3 device.
      TX processing up to L3 happens on the stack of the virtual device
      (in its associated namespace). Packets are switched to the
      main-device (default-ns) for the L2 processing. Hence the routing
      table of the default-ns will be used in this mode.
      
      RX processing is somewhat similar to the L2 mode except that in
      this mode only unicast packets are delivered to the virtual device
      while main-dev will handle all other packets.
      
      The devices can be added using the "ip" command from the iproute2
      package -
      
      	ip link add link <master> <virtual> type ipvlan mode [ l2 | l3 ]
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Laurent Chavey <chavey@google.com>
      Cc: Tim Hockin <thockin@google.com>
      Cc: Brandon Philips <brandon.philips@coreos.com>
      Cc: Pavel Emelianov <xemul@parallels.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 28 Oct, 2014 1 commit
    • K
      bridge: Add support for IEEE 802.11 Proxy ARP · 95850116
      Committed by Kyeyoon Park
      This feature is defined in IEEE Std 802.11-2012, 10.23.13. It allows
      the AP devices to keep track of the hardware-address-to-IP-address
      mapping of the mobile devices within the WLAN network.
      
      The AP will learn this mapping via observing DHCP, ARP, and NS/NA
      frames. When a request for such information is made (i.e. ARP request,
      Neighbor Solicitation), the AP will respond on behalf of the
      associated mobile device. In the process of doing so, the AP will drop
      the multicast request frame that was intended to go out to the wireless
      medium.
      
      It was recommended at the LKS workshop to do this implementation in
      the bridge layer. vxlan.c is already doing something very similar.
      The DHCP snooping code will be added to the userspace application
      (hostapd) per the recommendation.
      
      This RFC commit is only for IPv4. A similar approach in the bridge
      layer will be taken for IPv6 as well.
      Signed-off-by: Kyeyoon Park <kyeyoonp@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 30 Sep, 2014 1 commit
    • M
      macvlan: add source mode · 79cf79ab
      Committed by Michael Braun
      This patch adds a new mode of operation to macvlan, called "source".
      It allows one to set a list of allowed MAC addresses, which is
      matched against the source MAC address of frames received on the
      underlying interface.
      This enables creating MAC-based VLAN associations, instead of the
      standard port- or tag-based ones. The feature is useful for deploying
      802.1x MAC-based behavior where drivers of the underlying interfaces
      do not allow that.
      
      Configuration is done through the netlink interface using e.g.:
       ip link add link eth0 name macvlan0 type macvlan mode source
       ip link add link eth0 name macvlan1 type macvlan mode source
       ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
       ip link set link dev macvlan0 type macvlan macaddr add 00:22:22:22:22:22
       ip link set link dev macvlan0 type macvlan macaddr add 00:33:33:33:33:33
       ip link set link dev macvlan1 type macvlan macaddr add 00:33:33:33:33:33
       ip link set link dev macvlan1 type macvlan macaddr add 00:44:44:44:44:44
      
      This allows clients with MAC addresses 00:11:11:11:11:11 and
      00:22:22:22:22:22 to be part of only the VLAN associated with the
      macvlan0 interface, and clients with MAC address 00:44:44:44:44:44 to be
      part of only the VLAN associated with the macvlan1 interface. The client
      with MAC address 00:33:33:33:33:33 is associated with both VLANs.
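
The matching behaviour of the configuration above can be modelled as a lookup from source MAC address to the set of macvlan devices that allow it. This is a conceptual sketch with hypothetical names; it does not reflect the hash table used by the driver.

```python
from collections import defaultdict
from typing import Dict, List, Set


class SourceTable:
    """Maps an allowed source MAC address to the macvlan devices
    that should receive frames sent from it."""

    def __init__(self) -> None:
        self._by_mac: Dict[str, Set[str]] = defaultdict(set)

    def add(self, dev: str, mac: str) -> None:
        """Model of 'macaddr add <mac>' on device dev."""
        self._by_mac[mac.lower()].add(dev)

    def lookup(self, src_mac: str) -> List[str]:
        """Return the devices that accept frames from src_mac."""
        return sorted(self._by_mac.get(src_mac.lower(), ()))


table = SourceTable()
for mac in ("00:11:11:11:11:11", "00:22:22:22:22:22", "00:33:33:33:33:33"):
    table.add("macvlan0", mac)
for mac in ("00:33:33:33:33:33", "00:44:44:44:44:44"):
    table.add("macvlan1", mac)
```

A frame from 00:33:33:33:33:33 is then delivered to both devices, while a frame from an address that is not listed matches no source-mode device.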
      
      Based on work of Stefan Gula <steweg@gmail.com>
      
      v8: last version of Stefan Gula for Kernel 3.2.1
      v9: rework onto linux-next 2014-03-12 by Michael Braun
          add MACADDR_SET command, enable to configure mac for source mode
          while creating interface
      v10:
        - reduce indentation level
        - rename source_list to source_entry
        - use aligned 64bit ether address
        - use hash_64 instead of addr[5]
      v11:
        - rebase for 3.14 / linux-next 20.04.2014
      v12:
        - rebase for linux-next 2014-09-25
      Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 10 Sep, 2014 1 commit
  20. 12 Jul, 2014 1 commit
    • J
      ipv6: addrconf: implement address generation modes · bc91b0f0
      Committed by Jiri Pirko
      This patch introduces a possibility for userspace to set various (so far
      two) modes of generating addresses. This is useful for example for
      NetworkManager because it can set the mode to NONE and take care of link
      local addresses itself. That allows it to have the interface up and
      monitoring carrier while still not having any addresses on it.
      
      One more use-case by Dan Williams:
      <quote>
      WWAN devices often have their LL address provided by the firmware of the
      device, which sometimes refuses to respond to incorrect LL addresses
      when doing DHCPv6 or IPv6 ND.  The kernel cannot generate the correct LL
      address for two reasons:
      
      1) WWAN pseudo-ethernet interfaces often construct a fake MAC address,
      or read a meaningless MAC address from the firmware.  Thus the EUI64 and
      the IPv6LL address the kernel assigns will be wrong.  The real LL
      address is often retrieved from the firmware with AT or proprietary
      commands.
      
      2) WWAN PPP interfaces receive their LL address from IPV6CP, not from
      kernel assignments.  Only after IPV6CP has completed do we know the LL
      address of the PPP interface and its peer.  But the kernel has already
      assigned an incorrect LL address to the interface.
      
      So being able to suppress the kernel LL address generation and assign
      the one retrieved from the firmware is less complicated and more robust.
      </quote>
      Signed-off-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 05 Jun, 2014 1 commit
  22. 24 May, 2014 1 commit
    • S
      net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool. · ed616689
      Committed by Sucheta Chakraborty
      o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed
        to have a bandwidth of at least this value.
        max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth
        of up to this value.
      
      o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced
        which takes 4 arguments:
        netdev, VF number, min_tx_rate, max_tx_rate
      
      o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler.
      
      o Drivers that currently implement ndo_set_vf_tx_rate should now call
        ndo_set_vf_rate instead and reject any attempt to set a minimum
        bandwidth greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not
        yet implemented by the driver.
      
      o If user enters only one of either min_tx_rate or max_tx_rate, then,
        userland should read back the other value from driver and set both
        for IFLA_VF_RATE.
        Drivers that have not yet implemented IFLA_VF_RATE should always
        return min_tx_rate as 0 when read from ip tool.
      
      o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then
        IFLA_VF_RATE should override.
      
      o Idea is to have consistent display of rate values to user.
      
      o Usage example: -
      
        ./ip link set p4p1 vf 0 rate 900
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      
        ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps,
          min_tx_rate 200Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
      
        ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300
      
        ./ip link show p4p1
        32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
        DEFAULT qlen 1000
          link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
          vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 600 (Mbps), max_tx_rate 600Mbps,
          min_tx_rate 200Mbps
          vf 1 MAC f6:c6:7c:3f:3d:6c
          vf 2 MAC 56:32:43:98:d7:71
          vf 3 MAC d6:be:c3:b5:85:ff
          vf 4 MAC ee:a9:9a:1e:19:14
          vf 5 MAC 4a:d0:4c:07:52:18
          vf 6 MAC 3a:76:44:93:62:f9
          vf 7 MAC 82:e9:e7:e3:15:1a
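
The way the three commands in the usage example combine can be sketched as a small resolver. This models the userland rule described above (read back the value that was not given; IFLA_VF_RATE overrides IFLA_VF_TX_RATE). It is not the iproute2 or driver code, and the function and parameter names are hypothetical.

```python
from typing import Optional, Tuple


def resolve_vf_rate(current_min: int, current_max: int,
                    min_tx_rate: Optional[int] = None,
                    max_tx_rate: Optional[int] = None,
                    legacy_rate: Optional[int] = None) -> Tuple[int, int]:
    """Return the (min_tx_rate, max_tx_rate) pair to program, in Mbps.

    current_min/current_max stand for the values read back from the
    driver; legacy_rate is the old 'rate' option (IFLA_VF_TX_RATE).
    """
    # A value the user did not give is taken from the driver's state.
    new_min = current_min if min_tx_rate is None else min_tx_rate
    if max_tx_rate is not None:
        new_max = max_tx_rate   # the new attribute overrides legacy 'rate'
    elif legacy_rate is not None:
        new_max = legacy_rate   # legacy 'rate' only sets the cap
    else:
        new_max = current_max
    return new_min, new_max


state = (0, 0)
state = resolve_vf_rate(*state, legacy_rate=900)                    # rate 900
state = resolve_vf_rate(*state, min_tx_rate=200, max_tx_rate=300)
state = resolve_vf_rate(*state, max_tx_rate=600, legacy_rate=300)
```

After the third call the pair is (200, 600), matching the last listing in the usage example, where max_tx_rate 600 wins over rate 300 and min_tx_rate stays at 200.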
      Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 01 Apr, 2014 1 commit
  24. 24 Jan, 2014 2 commits
  25. 23 Jan, 2014 2 commits
  26. 18 Jan, 2014 1 commit
  27. 04 Jan, 2014 3 commits
  28. 20 Dec, 2013 3 commits