1. 03 2月, 2015 1 次提交
  2. 02 2月, 2015 2 次提交
  3. 31 1月, 2015 1 次提交
  4. 29 1月, 2015 1 次提交
  5. 24 1月, 2015 1 次提交
  6. 20 1月, 2015 4 次提交
  7. 19 1月, 2015 1 次提交
  8. 18 1月, 2015 5 次提交
    • M
      ip_tunnel: Create percpu gro_cell · 88340160
      Martin KaFai Lau 提交于
      In the ipip tunnel, the skb->queue_mapping is lost in ipip_rcv().
      All skb will be queued to the same cell->napi_skbs.  The
      gro_cell_poll is pinned to one core under load.  In production traffic,
      we also see severe rx_dropped in the tunl iface and it is probably due to
      this limit: skb_queue_len(&cell->napi_skbs) > netdev_max_backlog.
      
      This patch is trying to alloc_percpu(struct gro_cell) and schedule
      gro_cell_poll to process the skb in the same core.
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      88340160
    • J
      netlink: make nlmsg_end() and genlmsg_end() void · 053c095a
      Johannes Berg 提交于
      Contrary to common expectations for an "int" return, these functions
      return only a positive value -- if used correctly they cannot even
      return 0 because the message header will necessarily be in the skb.
      
      This makes the very common pattern of
      
        if (genlmsg_end(...) < 0) { ... }
      
      be a whole bunch of dead code. Many places also simply do
      
        return nlmsg_end(...);
      
      and the caller is expected to deal with it.
      
      This also commonly (at least for me) causes errors, because it is very
      common to write
      
        if (my_function(...))
          /* error condition */
      
      and if my_function() does "return nlmsg_end()" this is of course wrong.
      
      Additionally, there's not a single place in the kernel that actually
      needs the message length returned, and if anyone needs it later then
      it'll be very easy to just use skb->len there.
      
      Remove this, and make the functions void. This removes a bunch of dead
      code as described above. The patch adds lines because I did
      
      -	return nlmsg_end(...);
      +	nlmsg_end(...);
      +	return 0;
      
      I could have preserved all the function's return values by returning
      skb->len, but instead I've audited all the places calling the affected
      functions and found that none cared. A few places actually compared
      the return value with <= 0 in dump functionality, but that could just
      be changed to < 0 with no change in behaviour, so I opted for the more
      efficient version.
      
      One instance of the error I've made numerous times now is also present
      in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
      check for <0 or <=0 and thus broke out of the loop every single time.
      I've preserved this since it will (I think) have caused the messages to
      userspace to be formatted differently with just a single message for
      every SKB returned to userspace. It's possible that this isn't needed
      for the tools that actually use this, but I don't even know what they
      are so couldn't test that changing this behaviour would be acceptable.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      053c095a
    • J
      net: replace br_fdb_external_learn_* calls with switchdev notifier events · 3aeb6617
      Jiri Pirko 提交于
      This patch benefits from newly introduced switchdev notifier and uses it
      to propagate fdb learn events from rocker driver to bridge. That avoids
      direct function calls and possible use by other listeners (ovs).
      Suggested-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NScott Feldman <sfeldma@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3aeb6617
    • J
      switchdev: introduce switchdev notifier · 03bf0c28
      Jiri Pirko 提交于
      This patch introduces new notifier for purposes of exposing events which happen
      on switch driver side. The consumers of the event messages are mainly involved
      masters, namely bridge and ovs.
      Suggested-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NScott Feldman <sfeldma@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      03bf0c28
    • J
      tc: add BPF based action · d23b8ad8
      Jiri Pirko 提交于
      This action provides a possibility to exec custom BPF code.
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d23b8ad8
  9. 16 1月, 2015 3 次提交
  10. 15 1月, 2015 6 次提交
    • J
      cfg80211: remove 80+80 MHz rate reporting · 97d910d0
      Johannes Berg 提交于
      These rates are treated the same as 160 MHz in the spec, so
      it makes no sense to distinguish them. As no driver uses them
      yet, this is also not a problem, just remove them.
      
      In the userspace API the field remains reserved to preserve
      API and ABI.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      97d910d0
    • J
      mac80211: remove 80+80 MHz rate reporting · f89903d5
      Johannes Berg 提交于
      These rates are treated the same as 160 MHz in the spec,
      so it makes no sense to distinguish them. As no driver
      uses them yet, this is also not a problem, just remove
      them.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      f89903d5
    • T
      openvswitch: Support VXLAN Group Policy extension · 1dd144cf
      Thomas Graf 提交于
      Introduces support for the group policy extension to the VXLAN virtual
      port. The extension is disabled by default and only enabled if the user
      has provided the respective configuration.
      
        ovs-vsctl add-port br0 vxlan0 -- \
           set Interface vxlan0 type=vxlan options:exts=gbp
      
      The configuration interface to enable the extension is based on a new
      attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION
      which can carry additional extensions as needed in the future.
      
      The group policy metadata is stored as binary blob (struct ovs_vxlan_opts)
      internally just like Geneve options but transported as nested Netlink
      attributes to user space.
      
      Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the
      binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced.
      
      The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing
      OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1dd144cf
    • T
      vxlan: Only bind to sockets with compatible flags enabled · ac5132d1
      Thomas Graf 提交于
      A VXLAN net_device looking for an appropriate socket may only consider
      a socket which has a matching set of flags/extensions enabled. If
      incompatible flags are enabled, return a conflict to have the caller
      create a distinct socket with distinct port.
      
      The OVS VXLAN port is kept unaware of extensions at this point.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac5132d1
    • T
      vxlan: Group Policy extension · 3511494c
      Thomas Graf 提交于
      Implements supports for the Group Policy VXLAN extension [0] to provide
      a lightweight and simple security label mechanism across network peers
      based on VXLAN. The security context and associated metadata is mapped
      to/from skb->mark. This allows further mapping to a SELinux context
      using SECMARK, to implement ACLs directly with nftables, iptables, OVS,
      tc, etc.
      
      The group membership is defined by the lower 16 bits of skb->mark, the
      upper 16 bits are used for flags.
      
      SELinux allows to manage label to secure local resources. However,
      distributed applications require ACLs to implemented across hosts. This
      is typically achieved by matching on L2-L4 fields to identify the
      original sending host and process on the receiver. On top of that,
      netlabel and specifically CIPSO [1] allow to map security contexts to
      universal labels.  However, netlabel and CIPSO are relatively complex.
      This patch provides a lightweight alternative for overlay network
      environments with a trusted underlay. No additional control protocol
      is required.
      
                 Host 1:                       Host 2:
      
            Group A        Group B        Group B     Group A
            +-----+   +-------------+    +-------+   +-----+
            | lxc |   | SELinux CTX |    | httpd |   | VM  |
            +--+--+   +--+----------+    +---+---+   +--+--+
      	  \---+---/                     \----+---/
      	      |                              |
      	  +---+---+                      +---+---+
      	  | vxlan |                      | vxlan |
      	  +---+---+                      +---+---+
      	      +------------------------------+
      
      Backwards compatibility:
      A VXLAN-GBP socket can receive standard VXLAN frames and will assign
      the default group 0x0000 to such frames. A Linux VXLAN socket will
      drop VXLAN-GBP  frames. The extension is therefore disabled by default
      and needs to be specifically enabled:
      
         ip link add [...] type vxlan [...] gbp
      
      In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket
      must run on a separate port number.
      
      Examples:
       iptables:
        host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200
        host2# iptables -I INPUT -m mark --mark 0x200 -j DROP
      
       OVS:
        # ovs-ofctl add-flow br0 'in_port=1,actions=load:0x200->NXM_NX_TUN_GBP_ID[],NORMAL'
        # ovs-ofctl add-flow br0 'in_port=2,tun_gbp_id=0x200,actions=drop'
      
      [0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy
      [1] http://lwn.net/Articles/204905/Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3511494c
    • T
      vxlan: Remote checksum offload · dfd8645e
      Tom Herbert 提交于
      Add support for remote checksum offload in VXLAN. This uses a
      reserved bit to indicate that RCO is being done, and uses the low order
      reserved eight bits of the VNI to hold the start and offset values in a
      compressed manner.
      
      Start is encoded in the low order seven bits of VNI. This is start >> 1
      so that the checksum start offset is 0-254 using even values only.
      Checksum offset (transport checksum field) is indicated in the high
      order bit in the low order byte of the VNI. If the bit is set, the
      checksum field is for UDP (so offset = start + 6), else checksum
      field is for TCP (so offset = start + 16). Only TCP and UDP are
      supported in this implementation.
      
      Remote checksum offload for VXLAN is described in:
      
      https://tools.ietf.org/html/draft-herbert-vxlan-rco-00
      
      Tested by running 200 TCP_STREAM connections with VXLAN (over IPv4).
      
      With UDP checksums and Remote Checksum Offload
        IPv4
            Client
              11.84% CPU utilization
            Server
              12.96% CPU utilization
            9197 Mbps
        IPv6
            Client
              12.46% CPU utilization
            Server
              14.48% CPU utilization
            8963 Mbps
      
      With UDP checksums, no remote checksum offload
        IPv4
            Client
              15.67% CPU utilization
            Server
              14.83% CPU utilization
            9094 Mbps
        IPv6
            Client
              16.21% CPU utilization
            Server
              14.32% CPU utilization
            9058 Mbps
      
      No UDP checksums
        IPv4
            Client
              15.03% CPU utilization
            Server
              23.09% CPU utilization
            9089 Mbps
        IPv6
            Client
              16.18% CPU utilization
            Server
              26.57% CPU utilization
             8954 Mbps
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfd8645e
  11. 14 1月, 2015 3 次提交
  12. 13 1月, 2015 4 次提交
  13. 12 1月, 2015 1 次提交
  14. 08 1月, 2015 7 次提交
    • J
      nl80211: support per-TID station statistics · 6de39808
      Johannes Berg 提交于
      The base for the current statistics is pretty mixed up, support
      exporting RX/TX statistics for MSDUs per TID. This (currently)
      covers received MSDUs, transmitted MSDUs and retries/failures
      thereof.
      
      Doing it per TID for MSDUs makes more sense than say only per AC
      because it's symmetric - we could export per-AC statistics for all
      frames (which AC we used for transmission can be determined also
      for management frames) but per TID is better and usually data
      frames are really the ones we care about. Also, on RX we can't
      determine the AC - but we do know the TID for any QoS MPDU we
      received.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      6de39808
    • J
      nl80211: clarify packet statistics descriptions · 8d791361
      Johannes Berg 提交于
      The current statistics we keep aren't very clear, some are on
      MPDUs and some on MSDUs/MMPDUs. Clarify the descriptions based
      on the counters mac80211 keeps.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      8d791361
    • J
      cfg80211: add nl80211 beacon-only statistics · a76b1942
      Johannes Berg 提交于
      Add these two values:
       * BEACON_RX: number of beacons received from this peer
       * BEACON_SIGNAL_AVG: signal strength average for beacons only
      
      These can then be used for Android Lollipop's statistics request.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      a76b1942
    • J
      cfg80211: remove enum station_info_flags · 319090bf
      Johannes Berg 提交于
      This is really just duplicating the list of information that's
      already available in the nl80211 attribute, so remove the list.
      Two small changes are needed:
       * remove STATION_INFO_ASSOC_REQ_IES complete, but the length
         (assoc_req_ies_len) can be used instead
       * add NL80211_STA_INFO_RX_DROP_MISC which exists internally
         but not in nl80211 yet
      
      This gets rid of the duplicate maintenance of the two lists.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      319090bf
    • J
      mac80211: allow drivers to provide most station statistics · 2b9a7e1b
      Johannes Berg 提交于
      In many cases, drivers can filter things like beacons that will
      skew statistics reported by mac80211. To get correct statistics
      in these cases, call drivers to obtain statistics and let them
      override all values, filling values from mac80211 if the driver
      didn't provide them. Not all of them make sense for the driver
      to fill, so some are still always done by mac80211.
      
      Note that this doesn't currently allow a driver to say "I know
      this value is wrong, don't report it at all", or to sum it up
      with a mac80211 value (as could be useful for "dropped misc"),
      that can be added if it turns out to be needed.
      
      This also gets rid of the get_rssi() method as is can now be
      implemented using sta_statistics().
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      2b9a7e1b
    • J
      cfg80211: allow including station info in delete event · cf5ead82
      Johannes Berg 提交于
      When a station is removed, its statistics may be interesting to
      userspace, for example for further aggregation of statistics of
      all stations that ever connected to an AP.
      
      Introduce a new cfg80211_del_sta_sinfo() function (and make the
      cfg80211_del_sta() a static inline calling it) to allow passing
      a struct station_info along with this, and send the data in the
      nl80211 event message.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      cf5ead82
    • J
      cfg80211: add scan time to survey data · 052536ab
      Johannes Berg 提交于
      Add the time spent scanning to the survey data so it can be
      reported by drivers that collect such information.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      052536ab