1. 17 Jan, 2020 1 commit
  2. 15 Jan, 2020 1 commit
  3. 17 Dec, 2019 1 commit
  4. 09 Dec, 2019 1 commit
    • net: WireGuard secure network tunnel · e7096c13
      Committed by Jason A. Donenfeld
      WireGuard is a layer 3 secure networking tunnel made specifically for
      the kernel, that aims to be much simpler and easier to audit than IPsec.
      Extensive documentation and description of the protocol and
      considerations, along with formal proofs of the cryptography, are
      available at:
      
        * https://www.wireguard.com/
        * https://www.wireguard.com/papers/wireguard.pdf
      
      This commit implements WireGuard as a simple network device driver,
      accessible in the usual RTNL way used by virtual network drivers. It
      makes use of the udp_tunnel APIs, GRO, GSO, NAPI, and the usual set of
      networking subsystem APIs. It has a somewhat novel multicore queueing
      system designed for maximum throughput and minimal latency of encryption
      operations, but it is implemented modestly using workqueues and NAPI.
      Configuration is done via generic Netlink, and following a review from
      the Netlink maintainer a year ago, several high profile userspace tools
      have already implemented the API.
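
The queueing idea described above (do the expensive crypto work on several cores, then hand packets on in their original order) can be illustrated in user space. This is only a sketch under stated assumptions: `fake_encrypt` is a hypothetical stand-in for real encryption, and a thread pool stands in for the kernel workqueues. None of these names come from the WireGuard source.

```python
from concurrent.futures import ThreadPoolExecutor
import random
import time


def fake_encrypt(packet: bytes) -> bytes:
    """Hypothetical stand-in for per-packet encryption (not real crypto)."""
    time.sleep(random.uniform(0, 0.005))  # simulate uneven per-packet cost
    return bytes(b ^ 0x5A for b in packet)


def process_in_order(packets, workers=4):
    """Transform packets in parallel, but return results in submission order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submitting in order yields a FIFO of futures. Workers may finish
        # out of order, but we always wait on the head of the FIFO first.
        pending = [pool.submit(fake_encrypt, p) for p in packets]
        return [f.result() for f in pending]


packets = [bytes([i]) * 4 for i in range(16)]
sent = process_in_order(packets)
```

The FIFO of pending work items is what preserves ordering here: completion order does not matter because the consumer only ever looks at the oldest item.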
      
      This commit also comes with several different tests, both in-kernel
      tests and out-of-kernel tests based on network namespaces, taking
      advantage of the fact that sockets used by WireGuard intentionally stay
      in the namespace in which the WireGuard interface was originally
      created, exactly like the semantics of userspace tun devices. See
      wireguard.com/netns/ for pictures and examples.
      
      The source code is fairly short, but rather than combining everything
      into a single file, WireGuard is developed as cleanly separable files,
      making auditing and comprehension easier. Things are laid out as
      follows:
      
        * noise.[ch], cookie.[ch], messages.h: These implement the bulk of the
          cryptographic aspects of the protocol, and are mostly data-only in
          nature, taking in buffers of bytes and spitting out buffers of
          bytes. They also handle reference counting for their various shared
          pieces of data, like keys and key lists.
      
        * ratelimiter.[ch]: Used as an integral part of cookie.[ch] for
          ratelimiting certain types of cryptographic operations in accordance
          with particular WireGuard semantics.
      
        * allowedips.[ch], peerlookup.[ch]: The main lookup structures of
          WireGuard, the former being trie-like with particular semantics, an
          integral part of the design of the protocol, and the latter just
          being nice helper functions around the various hashtables we use.
      
        * device.[ch]: Implementation of functions for the netdevice and for
          rtnl, responsible for maintaining the life of a given interface and
          wiring it up to the rest of WireGuard.
      
        * peer.[ch]: Each interface has a list of peers, with helper functions
          available here for creation, destruction, and reference counting.
      
        * socket.[ch]: Implementation of functions related to udp_socket and
          the general set of kernel socket APIs, for sending and receiving
          ciphertext UDP packets, and taking care of WireGuard-specific sticky
          socket routing semantics for the automatic roaming.
      
        * netlink.[ch]: Userspace API entry point for configuring WireGuard
          peers and devices. The API has been implemented by several userspace
          tools and network management utilities, and the WireGuard project
          distributes the basic wg(8) tool.
      
        * queueing.[ch]: Shared functions on the rx and tx paths for handling
          the various queues used in the multicore algorithms.
      
        * send.c: Handles encrypting outgoing packets in parallel on
          multiple cores, before sending them in order on a single core, via
          workqueues and ring buffers. Also handles sending handshake and cookie
          messages as part of the protocol, in parallel.
      
        * receive.c: Handles decrypting incoming packets in parallel on
          multiple cores, before passing them off in order to be ingested via
          the rest of the networking subsystem with GRO via the typical NAPI
          poll function. Also handles receiving handshake and cookie messages
          as part of the protocol, in parallel.
      
        * timers.[ch]: Uses the timer wheel to implement protocol-specific
          event timeouts, and provides a set of very simple event-driven entry
          point functions for callers.
      
        * main.c, version.h: Initialization and deinitialization of the module.
      
        * selftest/*.h: Runtime unit tests for some of the most security
          sensitive functions.
      
        * tools/testing/selftests/wireguard/netns.sh: Aforementioned testing
          script using network namespaces.
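
To make the "trie-like" lookup structure mentioned for allowedips.[ch] more concrete, here is a toy binary trie for IPv4 prefixes. It assumes the lookup is a longest-prefix match, as is usual for routing-style tables. The class and peer names are hypothetical and the kernel implementation differs in many details.

```python
import ipaddress


class PrefixTrie:
    """Toy binary trie mapping IPv4 prefixes to peer names (illustrative only)."""

    def __init__(self):
        # Each node is a dict with optional child keys 0 and 1 and an
        # optional "peer" entry when a prefix ends at that node.
        self.root = {}

    def insert(self, cidr: str, peer: str) -> None:
        net = ipaddress.ip_network(cidr)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            node = node.setdefault(bit, {})
        node["peer"] = peer

    def lookup(self, addr: str):
        """Return the peer of the most specific matching prefix, or None."""
        bits = int(ipaddress.ip_address(addr))
        node, best = self.root, self.root.get("peer")
        for i in range(32):
            bit = (bits >> (31 - i)) & 1
            node = node.get(bit)
            if node is None:
                break
            best = node.get("peer", best)
        return best


table = PrefixTrie()
table.insert("10.0.0.0/8", "peer-a")
table.insert("10.1.0.0/16", "peer-b")
```

With this table, an address inside 10.1.0.0/16 resolves to the more specific entry, while any other 10.x address falls back to the /8 entry.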
      
      This commit aims to be as self-contained as possible, implementing
      WireGuard as a standalone module not needing much special handling or
      coordination from the network subsystem. I expect future
      optimizations to the network stack to benefit WireGuard, and
      vice versa, but for the time being, this is intentionally
      standalone.
      
      We introduce a menu option for CONFIG_WIREGUARD, as well as providing a
      verbose debug log and self-tests via CONFIG_WIREGUARD_DEBUG.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 22 Nov, 2019 1 commit
  6. 26 Sep, 2019 1 commit
  7. 21 May, 2019 1 commit
  8. 25 Mar, 2019 1 commit
  9. 19 Mar, 2019 1 commit
  10. 27 Feb, 2019 1 commit
  11. 15 Feb, 2019 1 commit
  12. 13 Feb, 2019 1 commit
  13. 09 Feb, 2019 1 commit
    • ipvlan: decouple l3s mode dependencies from other modes · c675e06a
      Committed by Daniel Borkmann
      Right now ipvlan has a hard dependency on CONFIG_NETFILTER and
      otherwise it cannot be built. However, the only ipvlan operation
      mode that actually depends on netfilter is l3s, everything else
      is independent of it. Break this hard dependency such that users
      are able to use ipvlan l3 mode on systems where netfilter is not
      compiled in.
      
      Therefore, this adds a hidden CONFIG_IPVLAN_L3S bool which
      defaults to y when CONFIG_NETFILTER is set in order to retain
      existing behavior for l3s. All l3s-related code is refactored
      into ipvlan_l3s.c, which is compiled in when enabled.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Martynas Pumputis <m@lambda.lt>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 19 Jan, 2019 1 commit
  15. 06 Dec, 2018 1 commit
  16. 29 May, 2018 2 commits
    • virtio_net: Extend virtio to use VF datapath when available · ba5e4426
      Committed by Sridhar Samudrala
      This patch enables virtio_net to switch over to a VF datapath when the
      STANDBY feature is enabled and a VF netdev is present with the same MAC
      address. It allows live migration of a VM with a direct attached VF
      without the need to set up a bond/team between a VF and virtio net device
      in the guest.
      
      It uses the API that is exported by the net_failover driver to create and
      destroy a master failover netdev. When the STANDBY feature is enabled, an
      additional netdev (failover netdev) is created that acts as a master device
      and tracks the state of the 2 lower netdevs. The original virtio_net netdev
      is marked as 'standby' netdev and a passthru device with the same MAC is
      registered as 'primary' netdev.
      
      The hypervisor needs to unplug the VF device from the guest on the source
      host and reset the MAC filter of the VF to initiate failover of datapath
      to virtio before starting the migration. After the migration is completed,
      the destination hypervisor sets the MAC filter on the VF and plugs it back
      to the guest to switch over to VF datapath.
      
      This patch is based on the discussion initiated by Jesse on this thread.
      https://marc.info/?l=linux-virtualization&m=151189725224231&w=2
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Introduce net_failover driver · cfc80d9a
      Committed by Sridhar Samudrala
      The net_failover driver provides an automated failover mechanism via APIs
      to create and destroy a failover master netdev, and manages the primary and
      standby slave netdevs that get registered via the generic failover
      infrastructure.
      
      The failover netdev acts as a master device and controls 2 slave devices. The
      original paravirtual interface gets registered as 'standby' slave netdev and
      a passthru/vf device with the same MAC gets registered as 'primary' slave
      netdev. Both 'standby' and 'failover' netdevs are associated with the same
      'pci' device. The user accesses the network interface via 'failover' netdev.
      The 'failover' netdev chooses 'primary' netdev as default for transmits when
      it is available with link up and running.
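
The transmit choice described above can be sketched as a small selection function. This is a simplified model, not the driver's code: the `Netdev` class and `pick_xmit_dev` are hypothetical names, and falling back to the standby only when it is itself up and running is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Netdev:
    """Hypothetical model of a lower netdev; the fields are illustrative."""
    name: str
    link_up: bool = False
    running: bool = False

    def usable(self) -> bool:
        return self.link_up and self.running


def pick_xmit_dev(primary: Optional[Netdev],
                  standby: Optional[Netdev]) -> Optional[Netdev]:
    """Prefer the primary (VF) when present and usable, else the standby."""
    if primary is not None and primary.usable():
        return primary
    if standby is not None and standby.usable():
        return standby
    return None  # no usable lower device in this model


vf = Netdev("vf0", link_up=True, running=True)
virtio = Netdev("virtio0", link_up=True, running=True)
```

Passing `None` for the primary models the VF being unplugged before migration, in which case traffic moves to the paravirtual device.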
      
      This can be used by paravirtual drivers to enable an alternate low latency
      datapath. It also enables hypervisor controlled live migration of a VM with
      direct attached VF by failing over to the paravirtual datapath when the VF
      is unplugged.
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 28 Apr, 2018 1 commit
  18. 30 Mar, 2018 1 commit
    • netdevsim: Add simple FIB resource controller via devlink · 37923ed6
      Committed by David Ahern
      Add devlink support to netdevsim and use it to implement a simple,
      profile based resource controller. Only one controller is needed
      per namespace, so the first netdevsim netdevice in a namespace
      registers with devlink. If that device is deleted, the resource
      settings are deleted.
      
      The resource controller allows a user to limit the number of IPv4 and
      IPv6 FIB entries and FIB rules. The resource paths are:
          /IPv4
          /IPv4/fib
          /IPv4/fib-rules
          /IPv6
          /IPv6/fib
          /IPv6/fib-rules
      
      The IPv4 and IPv6 top-level resources are unlimited in size and cannot
      be changed. From there, the number of FIB entries and FIB rule entries
      are unlimited by default. A user can specify a limit for the fib and
      fib-rules resources:
      
          $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 96
          $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib-rules size 16
          $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib size 64
          $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib-rules size 16
          $ devlink dev reload netdevsim/netdevsim0
      
      such that the number of rules or routes is limited (96 ipv4 routes in the
      example above):
          $ for n in $(seq 1 32); do ip ro add 10.99.$n.0/24 dev eth1; done
          Error: netdevsim: Exceeded number of supported fib entries.
      
          $ devlink resource show netdevsim/netdevsim0
          netdevsim/netdevsim0:
            name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables non
              resources:
                name fib size 96 occ 96 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables
          ...
      
      With this template in place for resource management, it is fairly trivial
      to extend and shows one way to implement a simple counter based resource
      controller typical of network profiles.
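
A counter-based resource controller of the kind described above can be modeled in a few lines. This sketch is not the netdevsim code: `CounterResource` and `ResourceLimitExceeded` are hypothetical names, and the limit of 96 is taken from the IPv4 fib example above.

```python
class ResourceLimitExceeded(Exception):
    """Raised when an entry would push occupancy past the configured limit."""


class CounterResource:
    """Toy counter-based resource: tracks occupancy against an optional limit."""

    def __init__(self, name, limit=None):
        self.name = name
        self.limit = limit  # None means unlimited
        self.occ = 0        # current occupancy

    def acquire(self):
        """Account for one new entry, or refuse if the limit is reached."""
        if self.limit is not None and self.occ >= self.limit:
            raise ResourceLimitExceeded(
                f"exceeded number of supported {self.name} entries")
        self.occ += 1

    def release(self):
        """Account for one removed entry."""
        if self.occ > 0:
            self.occ -= 1


fib = CounterResource("fib", limit=96)
for _ in range(96):
    fib.acquire()
```

Once occupancy equals the limit, the next `acquire()` raises, which corresponds to the route add being rejected in the shell example.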
      
      Currently, devlink only supports the initial namespace. Code is in place
      to adapt netdevsim to a per-namespace controller once the network
      namespace issues are resolved.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 27 Feb, 2018 1 commit
    • ipvlan: fix building with modular IPV6 · 7f897db3
      Committed by Arnd Bergmann
      We no longer depend on IPV6, but that now causes a link error with
      CONFIG_IPV6=m and CONFIG_IPVLAN=y:
      
      drivers/net/ipvlan/ipvlan_core.o: In function `ipvlan_queue_xmit':
      ipvlan_core.c:(.text+0x1440): undefined reference to `ip6_route_output_flags'
      drivers/net/ipvlan/ipvlan_core.o: In function `ipvlan_l3_rcv':
      ipvlan_core.c:(.text+0x1818): undefined reference to `ip6_route_input_lookup'
      
      This adds back the dependency on IPV6, with the option of building without
      IPV6, but forcing IPVLAN to be a module when IPV6 is a module.
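
The effect described above can be worked through with Kconfig's tristate arithmetic, where n < m < y, `||` takes the maximum and `!x` is 2 - x. The sketch below assumes the dependency is written with the common idiom `depends on IPV6 || !IPV6`; the exact expression is not quoted in this message, so treat it as an assumption.

```python
# Kconfig tristate values, ordered n < m < y.
N, M, Y = 0, 1, 2
NAMES = {N: "n", M: "m", Y: "y"}


def t_not(a):
    """Kconfig '!': n becomes y, y becomes n, m stays m."""
    return 2 - a


def t_or(a, b):
    """Kconfig '||' is the maximum of both operands."""
    return max(a, b)


def max_ipvlan(ipv6):
    """Upper bound on IPVLAN under the assumed 'depends on IPV6 || !IPV6'."""
    return t_or(ipv6, t_not(ipv6))


# Map each IPV6 setting to the highest value IPVLAN may take.
limits = {NAMES[v]: NAMES[max_ipvlan(v)] for v in (N, M, Y)}
```

Under this assumption, `limits` shows IPVLAN capped at m only when IPV6 is m, and free to be built in when IPV6 is either y or n.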
      
      Fixes: 94333fac ("ipvlan: drop ipv6 dependency")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 22 Feb, 2018 2 commits
  21. 03 Dec, 2017 1 commit
  22. 03 Oct, 2017 1 commit
  23. 24 Aug, 2017 1 commit
  24. 25 Apr, 2017 1 commit
  25. 18 Feb, 2017 1 commit
    • vmxnet3: prevent building with 64K pages · fbdf0e28
      Committed by Arnd Bergmann
      I got a warning about broken code on ARM64 with 64K pages:
      
      drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_rq_init':
      drivers/net/vmxnet3/vmxnet3_drv.c:1679:29: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
          rq->buf_info[0][i].len = PAGE_SIZE;
      
      'len' here is a 16-bit integer, so this clearly won't work. I don't think
      this driver is used much on anything other than x86, so there is no need
      to fix this properly and we can work around it with a Kconfig dependency
      to forbid known-broken configurations. qemu in theory supports it on
      other architectures too, but presumably only for compatibility with x86
      guests that also run on vmware.
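
The truncation the compiler warns about is easy to reproduce with a 16-bit unsigned value. The sketch below uses Python's `ctypes.c_uint16` to mimic the field width; `store_u16` is a hypothetical helper and the 4K page size is included only for comparison.

```python
import ctypes

PAGE_SIZE_4K = 4 * 1024
PAGE_SIZE_64K = 64 * 1024


def store_u16(value: int) -> int:
    """Return what a 16-bit unsigned field holds after storing `value`."""
    return ctypes.c_uint16(value).value


len_4k = store_u16(PAGE_SIZE_4K)    # fits: stays 4096
len_64k = store_u16(PAGE_SIZE_64K)  # 65536 needs 17 bits: wraps to 0
```

A 64K page size is exactly one more than the largest value a 16-bit field can hold (65535), so the stored length becomes zero.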
      
      CONFIG_PAGE_SIZE_64KB is used on hexagon, mips, sh and tile, the other
      symbols are architecture-specific names for the same thing.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 12 Feb, 2017 2 commits
  27. 09 Feb, 2017 1 commit
  28. 21 Sep, 2016 1 commit
  29. 19 Sep, 2016 1 commit
    • ipvlan: Introduce l3s mode · 4fbae7d8
      Committed by Mahesh Bandewar
      In a typical IPvlan L3 setup, the master is in the default-ns and
      each slave is in a different (slave) ns. In this setup, egress
      packet processing for traffic originating from a slave-ns
      hits all NF_HOOKs in the slave-ns as well as the default-ns. However,
      the same is not true for ingress processing: all these NF_HOOKs are
      hit only in the slave-ns, skipping them in the default-ns.
      IPvlan in L3 mode is restrictive, and if admins want to deploy
      iptables rules in the default-ns, this asymmetric data path makes it
      impossible to do so.
      
      This patch makes use of the l3_rcv() (added as part of l3mdev
      enhancements) to perform input route lookup on RX packets without
      changing the skb->dev and then uses nf_hook at NF_INET_LOCAL_IN
      to change the skb->dev just before handing over skb to L4.
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      CC: David Ahern <dsa@cumulusnetworks.com>
      Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  30. 11 May, 2016 1 commit
  31. 26 Apr, 2016 1 commit
  32. 18 Apr, 2016 1 commit
    • macsec: fix crypto Kconfig dependency · ab2ed017
      Committed by Arnd Bergmann
      The new MACsec driver uses the AES crypto algorithm, but can be configured
      even if CONFIG_CRYPTO is disabled, leading to a build error:
      
      warning: (MAC80211 && MACSEC) selects CRYPTO_GCM which has unmet direct dependencies (CRYPTO)
      warning: (BT && CEPH_LIB && INET && MAC802154 && MAC80211 && BLK_DEV_RBD && MACSEC && AIRO_CS && LIBIPW && HOSTAP && USB_WUSB && RTLLIB_CRYPTO_CCMP && FS_ENCRYPTION && EXT4_ENCRYPTION && CEPH_FS && BIG_KEYS && ENCRYPTED_KEYS) selects CRYPTO_AES which has unmet direct dependencies (CRYPTO)
      crypto/built-in.o: In function `gcm_enc_copy_hash':
      aes_generic.c:(.text+0x2b8): undefined reference to `crypto_xor'
      aes_generic.c:(.text+0x2dc): undefined reference to `scatterwalk_map_and_copy'
      
      This adds an explicit 'select CRYPTO' statement the way that other
      drivers handle it.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver")
      Signed-off-by: David S. Miller <davem@davemloft.net>
  33. 14 Mar, 2016 1 commit
  34. 13 Oct, 2015 1 commit
  35. 30 Sep, 2015 1 commit
  36. 28 Aug, 2015 1 commit
  37. 25 Aug, 2015 1 commit