1. 09 Feb, 2017 1 commit
    • cfg80211: fix NAN bands definition · 8585989d
      Luca Coelho committed
      The nl80211_nan_dual_band_conf enumeration doesn't make much sense.
      The default value is assigned to a bit, which makes it weird if the
      default bit and other bits are set at the same time.
      
      To improve this, get rid of NL80211_NAN_BAND_DEFAULT and add a wiphy
      configuration to let the drivers define which bands are supported.
      This is exposed to the userspace, which then can make a decision on
      which band(s) to use.  Additionally, rename all "dual_band" elements
      to "bands", to make things clearer.
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
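      The idea of a driver advertising its supported bands and userspace
      choosing among them can be sketched as a bitmask subset test. This is
      a minimal userspace sketch, not the cfg80211 code: the `model_` names
      and the band enum are assumptions made for illustration.

      ```c
      #include <stdint.h>

      /* Hypothetical band indices; one bit per band in the masks below. */
      enum model_band { MODEL_BAND_2GHZ, MODEL_BAND_5GHZ, MODEL_BAND_60GHZ };

      #define MODEL_BIT(n) (1u << (n))

      /* Returns 1 if every requested band is also advertised as supported. */
      static int model_bands_supported(uint8_t supported, uint8_t requested)
      {
      	return (requested & ~supported) == 0;
      }
      ```

      With a single mask there is no separate "default" bit that could be
      combined with real band bits.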
  2. 08 Feb, 2017 1 commit
    • cfg80211: Pass new RSSI level in CQM RSSI notification · bee427b8
      Andrzej Zaborowski committed
      Update the drivers to pass the RSSI level as a cfg80211_cqm_rssi_notify
      parameter and pass this value to userspace in a new nl80211 attribute.
      This helps both userspace and also helps in the implementation of the
      multiple RSSI thresholds CQM mechanism.
      
      Note for marvell/mwifiex I pass 0 for the RSSI value because the new
      RSSI value is not available to the driver at the time of the
      cfg80211_cqm_rssi_notify call, but the driver queries the new value
      immediately after that, so it is actually available just a moment later
      if we wanted to defer calling cfg80211_cqm_rssi_notify until that moment.
      Without this, the new cfg80211 code (patch 3) will call .get_station
      which will send a duplicate HostCmd_CMD_RSSI_INFO command to the hardware.
      Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  3. 24 Jan, 2017 1 commit
  4. 21 Jan, 2017 2 commits
    • tipc: make replicast a user selectable option · 01fd12bb
      Jon Paul Maloy committed
      If the bearer carrying multicast messages supports broadcast, those
      messages will be sent to all cluster nodes, irrespective of whether
      these nodes host any actual destination sockets or not. This is clearly
      wasteful if the cluster is large and there are only a few real
      destinations for the message being sent.
      
      In this commit we extend the eligibility of the newly introduced
      "replicast" transmit option. We now make it possible for a user to
      select which method he wants to be used, either as a mandatory setting
      via setsockopt(), or as a relative setting where we let the broadcast
      layer decide which method to use based on the ratio between cluster
      size and the message's actual number of destination nodes.
      
      In the latter case, a sending socket must stick to a previously
      selected method until it enters an idle period of at least 5 seconds.
      This eliminates the risk of message reordering caused by method change,
      i.e., when changes to cluster size or number of destinations would
      otherwise mandate a new method to be used.
      Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
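      The "relative" setting with its 5-second stickiness rule can be
      sketched as follows. This is a userspace model under stated
      assumptions, not the TIPC implementation: the `model_` names and the
      25% ratio threshold are hypothetical; only the 5-second idle period
      comes from the commit message.

      ```c
      enum model_method { MODEL_BCAST, MODEL_RCAST };

      struct model_sock {
      	int has_method;             /* 0 until a method was first chosen */
      	enum model_method method;   /* method currently in use */
      	unsigned long last_send_ms; /* time of the previous send */
      };

      #define MODEL_IDLE_MS   5000UL /* idle period from the commit message */
      #define MODEL_RATIO_PCT 25     /* hypothetical switch-over threshold */

      /* Pick a method for this send; re-evaluate only after an idle period. */
      static enum model_method model_select(struct model_sock *s,
      				      unsigned int cluster_size,
      				      unsigned int dests,
      				      unsigned long now_ms)
      {
      	int idle = !s->has_method ||
      		   now_ms - s->last_send_ms >= MODEL_IDLE_MS;

      	if (idle) {
      		/* Few destinations relative to the cluster: replicate. */
      		s->method = (dests * 100 < cluster_size * MODEL_RATIO_PCT) ?
      			    MODEL_RCAST : MODEL_BCAST;
      		s->has_method = 1;
      	}
      	s->last_send_ms = now_ms;
      	return s->method;
      }
      ```

      Because the method only changes after the socket has been idle, two
      back-to-back messages never travel by different methods, which is
      what rules out reordering.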
    • bpf: add bpf_probe_read_str helper · a5e8c070
      Gianluca Borello committed
      Provide a simple helper with the same semantics of strncpy_from_unsafe():
      
      int bpf_probe_read_str(void *dst, int size, const void *unsafe_addr)
      
      This gives more flexibility to a bpf program. A typical use case is
      intercepting a file name during sys_open(). The current approach is:
      
      SEC("kprobe/sys_open")
      void bpf_sys_open(struct pt_regs *ctx)
      {
      	char buf[PATHLEN]; // PATHLEN is defined to 256
      	bpf_probe_read(buf, sizeof(buf), ctx->di);
      
      	/* consume buf */
      }
      
      This is suboptimal because the size of the string needs to be estimated
      at compile time, causing more memory to be copied than often necessary,
      and can become more problematic if further processing on buf is done,
      for example by pushing it to userspace via bpf_perf_event_output(),
      since the real length of the string is unknown and the entire buffer
      must be copied (and defining an unrolled strnlen() inside the bpf
      program is a very inefficient and unfeasible approach).
      
      With the new helper, the code can easily operate on the actual string
      length rather than the buffer size:
      
      SEC("kprobe/sys_open")
      void bpf_sys_open(struct pt_regs *ctx)
      {
      	char buf[PATHLEN]; // PATHLEN is defined to 256
      	int res = bpf_probe_read_str(buf, sizeof(buf), ctx->di);
      
      	/* consume buf, for example push it to userspace via
      	 * bpf_perf_event_output(), but this time we can use
      	 * res (the string length) as event size, after checking
      	 * its boundaries.
      	 */
      }
      
      Another useful use case is when parsing individual process arguments or
      individual environment variables navigating current->mm->arg_start and
      current->mm->env_start: using this helper and the return value, one can
      quickly iterate at the right offset of the memory area.
      
      The code changes simply leverage the already existent
      strncpy_from_unsafe() kernel function, which is safe to be called from a
      bpf program as it is used in bpf_trace_printk().
      Signed-off-by: Gianluca Borello <g.borello@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
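      The return-value contract that makes the helper useful can be
      modelled in plain userspace C. This is only a sketch: `model_read_str`
      is a hypothetical stand-in that assumes the helper copies at most
      size - 1 bytes, always NUL-terminates, and returns the length
      including the NUL. It does no fault handling, and the walk over
      NUL-separated strings assumes every string fits in the scratch buffer.

      ```c
      #include <stddef.h>

      /* Hypothetical model of the helper's contract (no fault handling). */
      static int model_read_str(char *dst, int size, const char *src)
      {
      	int i;

      	if (size <= 0)
      		return -1;
      	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
      		dst[i] = src[i];
      	dst[i] = '\0';
      	return i + 1; /* bytes written, including the NUL */
      }

      /* Event size to emit: the string length if it is in bounds, else 0. */
      static size_t model_event_size(int res, size_t buf_size)
      {
      	if (res <= 0 || (size_t)res > buf_size)
      		return 0;
      	return (size_t)res;
      }

      /* Walk an area of NUL-separated strings, such as an argument block,
       * advancing by the return value each time.
       */
      static int model_count_strings(const char *area, int area_len)
      {
      	char scratch[64];
      	int off = 0, count = 0;

      	while (off < area_len) {
      		int res = model_read_str(scratch, sizeof(scratch), area + off);

      		if (res <= 0)
      			break;
      		off += res; /* skip the string and its NUL */
      		count++;
      	}
      	return count;
      }
      ```

      The bounds check in `model_event_size` mirrors the "after checking
      its boundaries" remark in the commit message.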
  5. 19 Jan, 2017 2 commits
  6. 18 Jan, 2017 3 commits
    • bridge: sparse fixes in br_ip6_multicast_alloc_query() · 53631a5f
      Lance Richardson committed
      Changed type of csum field in struct igmpv3_query from __be16 to
      __sum16 to eliminate type warning, made same change in struct
      igmpv3_report for consistency.
      
      Fixed up an ntohs() where htons() should have been used instead.
      Signed-off-by: Lance Richardson <lrichard@redhat.com>
      Acked-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
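      An ntohs()/htons() mix-up usually goes unnoticed at runtime because
      both functions perform the same byte swap on any given machine; only
      the annotated types differ, which is what sparse checks. A small
      userspace sketch (the `model_` wrappers are hypothetical names):

      ```c
      #include <arpa/inet.h>
      #include <stdint.h>

      /* Host order to network (wire) order. */
      static uint16_t model_to_wire(uint16_t host_val)
      {
      	return htons(host_val);
      }

      /* Network (wire) order back to host order. */
      static uint16_t model_from_wire(uint16_t wire_val)
      {
      	return ntohs(wire_val);
      }
      ```

      Swapping the two calls would produce the same bits, so the fix is
      about expressing the direction of the conversion correctly rather
      than changing behavior.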
    • mpls: Packet stats · 27d69105
      Robert Shearman committed
      Having MPLS packet stats is useful for observing network operation and
      for diagnosing network problems. In the absence of anything better,
      RFC2863 and RFC3813 are used for guidance for which stats to expose
      and the semantics of them. In particular rx_noroutes maps to in
      unknown protos in RFC2863. The stats are exposed to userspace via
      AF_MPLS attributes embedded in the IFLA_STATS_AF_SPEC attribute of
      RTM_GETSTATS messages.
      
      All the introduced fields are 64-bit, even error ones, to ensure no
      overflow with long uptimes. Per-CPU counters are used to avoid
      cache-line contention on the commonly used fields. The other fields
      have also been made per-CPU for code to avoid performance problems in
      error conditions on the assumption that on some platforms the cost of
      atomic operations could be more expensive than sending the packet
      (which is what would be done in the success case). If that's not the
      case, we could instead not use per-CPU counters for these fields.
      
      Only unicast and non-fragment are exposed at the moment, but other
      counters can be exposed in the future either by adding to the end of
      struct mpls_link_stats or by additional netlink attributes in the
      AF_MPLS IFLA_STATS_AF_SPEC nested attribute.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
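      The per-CPU counter pattern described above can be sketched in
      userspace as one counter set per CPU that is summed only when read.
      This is a model, not the kernel's per-CPU API: the fixed CPU count,
      the `model_` names and the rx_packets/rx_bytes fields are
      assumptions; rx_noroutes is the counter named in the commit message.

      ```c
      #include <stdint.h>

      #define MODEL_NR_CPUS 4 /* assumption: fixed CPU count for the sketch */

      struct model_mpls_stats {
      	uint64_t rx_packets;  /* illustrative field */
      	uint64_t rx_bytes;    /* illustrative field */
      	uint64_t rx_noroutes; /* named in the commit message */
      };

      /* One private counter set per CPU, so writers never share a line. */
      static struct model_mpls_stats model_pcpu[MODEL_NR_CPUS];

      static void model_rx_ok(unsigned int cpu, uint64_t len)
      {
      	model_pcpu[cpu].rx_packets++;
      	model_pcpu[cpu].rx_bytes += len;
      }

      static void model_rx_noroute(unsigned int cpu)
      {
      	model_pcpu[cpu].rx_noroutes++;
      }

      /* Readers pay the cost: sum every CPU's counters on demand. */
      static struct model_mpls_stats model_read_stats(void)
      {
      	struct model_mpls_stats sum = { 0, 0, 0 };
      	unsigned int cpu;

      	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
      		sum.rx_packets += model_pcpu[cpu].rx_packets;
      		sum.rx_bytes += model_pcpu[cpu].rx_bytes;
      		sum.rx_noroutes += model_pcpu[cpu].rx_noroutes;
      	}
      	return sum;
      }
      ```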
    • net: AF-specific RTM_GETSTATS attributes · aefb4d4a
      Robert Shearman committed
      Add the functionality for including address-family-specific per-link
      stats in RTM_GETSTATS messages. This is done through adding a new
      IFLA_STATS_AF_SPEC attribute under which address family attributes are
      nested and then the AF-specific attributes can be further nested. This
      follows the model of IFLA_AF_SPEC on RTM_*LINK messages and it has the
      advantage of presenting an easily extended hierarchy. The rtnl_af_ops
      structure is extended to provide AFs with the opportunity to fill and
      provide the size of their stats attributes.
      
      One alternative would have been to provide AFs with the ability to add
      attributes directly into the RTM_GETSTATS message without a nested
      hierarchy. I discounted this approach as it increases the rate at
      which the 32 attribute number space is used up and it makes
      implementation a little more tricky for stats dump resuming (at the
      moment the order in which attributes are added to the message has to
      match the numeric order of the attributes).
      
      Another alternative would have been to register per-AF RTM_GETSTATS
      handlers. I discounted this approach as I perceived a common use-case
      to be getting all the stats for an interface and this approach would
      necessitate multiple requests/dumps to retrieve them all.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
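      The nesting scheme (an outer attribute containing one attribute per
      address family, which in turn contains that family's own attributes)
      can be illustrated with a tiny length/type/value builder. This is a
      userspace sketch, not the kernel's netlink API: the `model_` names
      are hypothetical, the attribute type numbers are up to the caller,
      and there is no bounds checking on the buffer.

      ```c
      #include <stddef.h>
      #include <stdint.h>
      #include <string.h>

      /* Attributes are padded to 4-byte boundaries in this model. */
      #define MODEL_ALIGN(n) (((n) + 3) & ~(size_t)3)

      struct model_attr {
      	uint16_t len;  /* header plus payload, nested attributes included */
      	uint16_t type;
      };

      struct model_msg {
      	unsigned char buf[256];
      	size_t used;
      };

      /* Open a nested attribute; returns its offset for model_nest_end(). */
      static size_t model_nest_start(struct model_msg *m, uint16_t type)
      {
      	struct model_attr hdr = { 0, type };
      	size_t off = m->used;

      	memcpy(m->buf + off, &hdr, sizeof(hdr));
      	m->used += sizeof(hdr);
      	return off;
      }

      /* Close a nested attribute by patching its length field. */
      static void model_nest_end(struct model_msg *m, size_t off)
      {
      	uint16_t len = (uint16_t)(m->used - off);

      	memcpy(m->buf + off, &len, sizeof(len));
      }

      /* Append a leaf attribute carrying one 64-bit counter. */
      static void model_put_u64(struct model_msg *m, uint16_t type, uint64_t v)
      {
      	struct model_attr hdr;

      	hdr.len = (uint16_t)(sizeof(hdr) + sizeof(v));
      	hdr.type = type;
      	memcpy(m->buf + m->used, &hdr, sizeof(hdr));
      	memcpy(m->buf + m->used + sizeof(hdr), &v, sizeof(v));
      	m->used += MODEL_ALIGN(sizeof(hdr) + sizeof(v));
      }
      ```

      A new address family only needs one type number inside the outer
      nest, so the top-level attribute space is consumed once rather than
      once per statistic.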
  7. 17 Jan, 2017 2 commits
    • ipv6: sr: add missing Kbuild export for header files · a50a05f4
      David Lebrun committed
      Add missing IPv6-SR header files in include/uapi/linux/Kbuild.
      
      Also, prevent seg6_lwt_headroom() from being exported and add
      missing linux/types.h include.
      Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: rework prog_digest into prog_tag · f1f7714e
      Daniel Borkmann committed
      Commit 7bd509e3 ("bpf: add prog_digest and expose it via
      fdinfo/netlink") was recently discussed, partially due to
      admittedly suboptimal name of "prog_digest" in combination
      with sha1 hash usage, thus inevitably and rightfully concerns
      about its security in terms of collision resistance were
      raised with regards to use-cases.
      
      The intended use cases are for debugging resp. introspection
      only for providing a stable "tag" over the instruction sequence
      that both kernel and user space can calculate independently.
      It's not usable at all for making a security relevant decision.
      So collisions where two different instruction sequences generate
      the same tag can happen, but ideally at a rather low rate. The
      "tag" will be dumped in hex and is short enough to introspect
      in tracepoints or kallsyms output along with other data such
      as stack trace, etc. Thus, this patch performs a rename into
      prog_tag and truncates the tag to a short output (64 bits) to
      make it obvious it's not collision-free.
      
      Should in future a hash or facility be needed with a security
      relevant focus, then we can think about requirements, constraints,
      etc that would fit to that situation. For now, rework the exposed
      parts for the current use cases as long as nothing has been
      released yet. Tested on x86_64 and s390x.
      
      Fixes: 7bd509e3 ("bpf: add prog_digest and expose it via fdinfo/netlink")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
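      Truncating a digest to a 64-bit tag and dumping it in hex can be
      sketched as below. This is a userspace illustration that assumes the
      tag is simply the leading 8 bytes of a 20-byte SHA-1 digest; the
      `model_` names are hypothetical and the digest itself is taken as
      input rather than computed.

      ```c
      #include <stdint.h>
      #include <stdio.h>

      #define MODEL_DIGEST_SIZE 20 /* SHA-1 digest length in bytes */
      #define MODEL_TAG_SIZE    8  /* 64-bit tag, as the commit describes */

      /* Write the first MODEL_TAG_SIZE digest bytes as lowercase hex.
       * out must hold 2 * MODEL_TAG_SIZE characters plus the NUL.
       */
      static void model_tag_hex(const uint8_t digest[MODEL_DIGEST_SIZE],
      			  char out[2 * MODEL_TAG_SIZE + 1])
      {
      	int i;

      	for (i = 0; i < MODEL_TAG_SIZE; i++)
      		snprintf(out + 2 * i, 3, "%02x", (unsigned int)digest[i]);
      }
      ```

      Sixteen hex characters are short enough to sit next to a stack trace,
      and the visible truncation signals that the value is an identifier,
      not a collision-resistant hash.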
  8. 13 Jan, 2017 3 commits
  9. 12 Jan, 2017 1 commit
  10. 11 Jan, 2017 2 commits
  11. 10 Jan, 2017 4 commits
  12. 09 Jan, 2017 6 commits
    • cfg80211: NL80211_ATTR_SOCKET_OWNER support for CMD_CONNECT · bd2522b1
      Andrzej Zaborowski committed
      Disconnect or deauthenticate when the owning socket is closed if this
      flag is supplied to CMD_CONNECT or CMD_ASSOCIATE.  This may be used
      to ensure userspace daemon doesn't leave an unmanaged connection behind.
      
      In some situations it would be possible to account for that, to some
      degree, in the daemon restart code or in the up/down scripts without
      the use of this attribute.  But there will be systems where the daemon
      can go away for varying periods without a warning due to local resource
      management.
      Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • net-tc: convert tc_from to tc_from_ingress and tc_redirected · bc31c905
      Willem de Bruijn committed
      The tc_from field fulfills two roles. It encodes whether a packet was
      redirected by an act_mirred device and, if so, whether act_mirred was
      called on ingress or egress. Split it into separate fields.
      
      The information is needed by the special IFB loop, where packets are
      taken out of the normal path by act_mirred, forwarded to IFB, then
      reinjected at their original location (ingress or egress) by IFB.
      
      The IFB device cannot use skb->tc_at_ingress, because that may have
      been overwritten as the packet travels from act_mirred to ifb_xmit,
      when it passes through tc_classify on the IFB egress path. Cache this
      value in skb->tc_from_ingress.
      
      That field is valid only if a packet arriving at ifb_xmit came from
      act_mirred. Other packets can be crafted to reach ifb_xmit. These
      must be dropped. Set tc_redirected on redirection and drop all packets
      that do not have this bit set.
      
      Both fields are set only on cloned skbs in tc actions, so original
      packet sources do not have to clear the bit when reusing packets
      (notably, pktgen and octeon).
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
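      The split into two single-bit fields and the resulting drop rule can
      be sketched in userspace. The bitfield names follow the commit
      message; the struct, the functions and the verdict enum are
      hypothetical stand-ins, not the kernel's sk_buff or IFB code.

      ```c
      /* Stand-in for the relevant sk_buff bits. */
      struct model_skb {
      	unsigned int tc_at_ingress:1;   /* may be overwritten en route */
      	unsigned int tc_redirected:1;   /* set only by the redirect action */
      	unsigned int tc_from_ingress:1; /* cached origin of the redirect */
      };

      enum model_verdict {
      	MODEL_DROP,
      	MODEL_REINJECT_INGRESS,
      	MODEL_REINJECT_EGRESS
      };

      /* Redirect step: mark the packet and cache where it came from. */
      static void model_mirred_redirect(struct model_skb *skb)
      {
      	skb->tc_redirected = 1;
      	skb->tc_from_ingress = skb->tc_at_ingress;
      }

      /* IFB transmit step: trust tc_from_ingress only on redirected packets. */
      static enum model_verdict model_ifb_xmit(const struct model_skb *skb)
      {
      	if (!skb->tc_redirected)
      		return MODEL_DROP;
      	return skb->tc_from_ingress ? MODEL_REINJECT_INGRESS :
      				      MODEL_REINJECT_EGRESS;
      }
      ```

      Keeping the cached origin separate from tc_at_ingress means a later
      classification pass can rewrite tc_at_ingress without losing the
      reinjection point.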
    • net-tc: convert tc_verd to integer bitfields · a5135bcf
      Willem de Bruijn committed
      Extract the remaining two fields from tc_verd and remove the __u16
      completely. TC_AT and TC_FROM are converted to equivalent two-bit
      integer fields tc_at and tc_from. Where possible, use existing
      helper skb_at_tc_ingress when reading tc_at. Introduce helper
      skb_reset_tc to clear fields.
      
      Not documenting tc_from and tc_at, because they will be replaced
      with single bit fields in follow-on patches.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-tc: extract skip classify bit from tc_verd · e7246e12
      Willem de Bruijn committed
      Packets sent by the IFB device skip subsequent tc classification.
      A single bit governs this state. Move it out of tc_verd in
      anticipation of removing that __u16 completely.
      
      The new bitfield tc_skip_classify temporarily uses one bit of a
      hole, until tc_verd is removed completely in a follow-up patch.
      
      Remove the bit hole comment. It could be 2, 3, 4 or 5 bits long.
      With that many options, little value in documenting it.
      
      Introduce a helper function to deduplicate the logic in the two
      sites that check this bit.
      
      The field tc_skip_classify is set only in IFB on skbs cloned in
      act_mirred, so original packet sources do not have to clear the
      bit when reusing packets (notably, pktgen and octeon).
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-tc: make MAX_RECLASSIFY_LOOP local · d6264071
      Willem de Bruijn committed
      This field is no longer kept in tc_verd. Remove it from the global
      definition of that struct.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-tc: remove unused tc_verd fields · aec745e2
      Willem de Bruijn committed
      Remove the last reference to tc_verd's munge and redirect ttl bits.
      These fields are no longer used.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 08 Jan, 2017 1 commit
  14. 03 Jan, 2017 2 commits
  15. 02 Jan, 2017 1 commit
  16. 16 Dec, 2016 5 commits
  17. 15 Dec, 2016 2 commits
    • IB: Add vmw_pvrdma driver · 29c8d9eb
      Adit Ranadive committed
      This patch series adds a driver for a paravirtual RDMA device. The
      device is developed for VMware's Virtual Machines and allows existing RDMA
      applications to continue to use existing Verbs API when deployed in VMs
      on ESXi. We recently did a presentation in the OFA Workshop [1] regarding
      this device.
      
      Description and RDMA Support
      ============================
      The virtual device is exposed as a dual function PCIe device. One part
      is a virtual network device (VMXNet3) which provides networking properties
      like MAC, IP addresses to the RDMA part of the device. The networking
      properties are used to register GIDs required by RDMA applications to
      communicate.
      
      These patches add support and all the required infrastructure for
      letting applications use such a device. We support the mandatory Verbs API as
      well as the base memory management extensions (Local Inv, Send with Inv and
      Fast Register Work Requests). We currently support both Reliable Connected
      and Unreliable Datagram QPs but do not support Shared Receive Queues
      (SRQs).
      
      Also, we support the following types of Work Requests:
       o Send/Receive (with or without Immediate Data)
       o RDMA Write (with or without Immediate Data)
       o RDMA Read
       o Local Invalidate
       o Send with Invalidate
       o Fast Register Work Requests
      
      This version only adds support for version 1 of RoCE. We will add RoCEv2
      support in a future patch. We do support registration of both MAC-based
      and IP-based GIDs. I have also created a git tree for our user-level driver
      [2].
      
      Testing
      =======
      We have tested this internally for various types of Guest OS - Red Hat,
      Centos, Ubuntu 12.04/14.04/16.04, Oracle Enterprise Linux, SLES 12
      using backported versions of this driver. The tests included several
      runs of the performance tests (included with OFED), Intel MPI PingPong
      benchmark on OpenMPI, krping for FRWRs. Mellanox has been kind enough
      to test the backported version of the driver internally on their hardware
      using a VMware provided ESX build. I have also applied and tested this
      with Doug's k.o/for-4.9 branch (commit 5603910b). Note, that this patch
      series should be applied all together. I split out the commits so that
      it may be easier to review.
      
      PVRDMA Resources
      ================
      [1] OFA Workshop Presentation -
      https://openfabrics.org/images/eventpresos/2016presentations/102parardma.pdf
      
      [2] Libpvrdma User-level library -
      http://git.openfabrics.org/?p=~aditr/libpvrdma.git;a=summary
      Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
      Reviewed-by: George Zhang <georgezhang@vmware.com>
      Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
      Reviewed-by: Bryan Tan <bryantan@vmware.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Adit Ranadive <aditr@vmware.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • rdma UAPI: Use __kernel_sockaddr_storage · 35493294
      Jason Gunthorpe committed
      The kernel side is #ifdef'd to this type, and the UAPI header
      should use it directly. It has slightly different alignment
      requirements from the usual user space version.
      Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  18. 14 Dec, 2016 1 commit