1. 30 September 2015, 1 commit
  2. 22 July 2015, 1 commit
  3. 14 May 2015, 1 commit
  4. 07 January 2015, 1 commit
  5. 03 December 2014, 1 commit
  6. 28 October 2014, 1 commit
    • bpf: split eBPF out of NET · f89b7755
      Committed by Alexei Starovoitov
      introduce two configs:
      - hidden CONFIG_BPF to select eBPF interpreter that classic socket filters
        depend on
      - visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use
      
      that solves several problems:
      - tracing and others that wish to use eBPF don't need to depend on NET.
        They can use BPF_SYSCALL to allow loading from userspace or select BPF
        to use it directly from kernel in NET-less configs.
      - in 3.18 programs cannot be attached to events yet, so don't force it on
      - when the rest of eBPF infra is there in 3.19+, it's still useful to
        switch it off to minimize kernel size
      
      bloat-o-meter on x64 shows:
      add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601)
      
      tested with many different config combinations. Hopefully didn't miss anything.
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f89b7755
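      A hedged userspace sketch related to the commit above: because CONFIG_BPF_SYSCALL
      defaults to off, a program can probe at runtime whether bpf(2) exists; a kernel
      built without the option returns ENOSYS. The syscall-number fallback below is an
      assumption for x86_64.

      /* Minimal sketch, not part of the commit: probe for CONFIG_BPF_SYSCALL
       * by issuing a deliberately invalid bpf(2) call and checking errno. */
      #include <errno.h>
      #include <stdio.h>
      #include <unistd.h>
      #include <sys/syscall.h>

      #ifndef __NR_bpf
      #define __NR_bpf 321   /* assumption: x86_64 syscall number */
      #endif

      int main(void)
      {
          long ret = syscall(__NR_bpf, -1 /* invalid cmd */, (void *)0, 0);

          if (ret < 0 && errno == ENOSYS)
              printf("bpf(2) unavailable: kernel built without CONFIG_BPF_SYSCALL\n");
          else
              printf("bpf(2) present: CONFIG_BPF_SYSCALL=y (errno=%d)\n", errno);
          return 0;
      }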
  7. 11 October 2014, 1 commit
  8. 01 October 2014, 1 commit
  9. 27 September 2014, 1 commit
    • netfilter: bridge: move br_netfilter out of the core · 34666d46
      Committed by Pablo Neira Ayuso
      Jesper reported that br_netfilter always registers the hooks since
      this is part of the bridge core. This harms performance for people that
      don't need this.
      
      This patch modularizes br_netfilter so it can be rmmod'ed and thus
      the hooks can be unregistered. I think the bridge netfilter should have
      been a separate module from the beginning; Patrick agreed on that.
      
      Note that this breaks compatibility for users who expect that
      bridge netfilter is going to be available after an explicit 'modprobe
      bridge' or an automatic load through brctl.
      
      However, the damage can be easily undone by modprobing br_netfilter.
      The bridge core also prints a message to provide a clue to people who
      didn't notice that this has been deprecated.
      
      On top of that, the plan is that nftables will not rely on this software
      layer, but will integrate the connection tracking into the bridge layer to
      enable stateful filtering and NAT, which is what bridge netfilter users
      seem to require.
      
      This patch still keeps the fake_dst_ops in the bridge core, since this
      is required when the bridge port is initialized. So we can safely
      modprobe/rmmod br_netfilter anytime.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Florian Westphal <fw@strlen.de>
      34666d46
  10. 12 July 2014, 1 commit
  11. 02 April 2014, 1 commit
    • net: ptp: move PTP classifier in its own file · 408eccce
      Committed by Daniel Borkmann
      This commit fixes a build error reported by Fengguang, which is
      triggered when CONFIG_NETWORK_PHY_TIMESTAMPING is not set:
      
        ERROR: "ptp_classify_raw" [drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.ko] undefined!
      
      The fix is to give the PTP BPF classifier its own file,
      so that PTP_1588_CLOCK and/or NETWORK_PHY_TIMESTAMPING can select
      it independently of each other. The IXP4xx driver on ARM needs to
      select it as well since it does not seem to select PTP_1588_CLOCK
      or anything similar that would pull it in automatically.
      
      This also allows for hiding all of the internals of the BPF PTP
      program inside that file, and only exporting relevant API bits
      to drivers.
      
      This patch also adds kdoc documentation for the ptp_classify_raw()
      API to make it clear that it can return PTP_CLASS_* defines. Also,
      the BPF program has been translated into bpf_asm code, so that it
      can be more easily read and altered (extensively documented in [1]).
      
      In the kernel tree under tools/net/ we have bpf_asm and bpf_dbg
      tools, so the commented program can simply be translated via
      `./bpf_asm -c prog` where prog is a file that contains the
      commented code. This makes it easily readable and verifiable, and when
      there's a need to change something, jump offsets etc. do not need
      to be adjusted manually, which can be very error-prone. Instead,
      a newly translated version via bpf_asm can simply replace the old
      code. I have checked opcode diffs before/after and it is the very
      same filter.
      
        [1] Documentation/networking/filter.txt
      
      Fixes: 164d8c66 ("net: ptp: do not reimplement PTP/BPF classifier")
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Jiri Benc <jbenc@redhat.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      408eccce
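      A hedged kernel-side sketch of how a driver might use the classifier this commit
      exports; ptp_classify_raw() and PTP_CLASS_NONE come from the commit, while
      my_driver_tx_needs_timestamp() is a hypothetical wrapper, not code from the patch.

      /* Sketch only: decide whether an outgoing skb is a PTP event message
       * before requesting a hardware timestamp for it. */
      #include <linux/skbuff.h>
      #include <linux/ptp_classify.h>

      static bool my_driver_tx_needs_timestamp(struct sk_buff *skb)
      {
          unsigned int type = ptp_classify_raw(skb);

          /* PTP_CLASS_NONE means the BPF classifier matched nothing. */
          return type != PTP_CLASS_NONE;
      }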
  12. 08 February 2014, 1 commit
    • cgroup: make CONFIG_CGROUP_NET_PRIO bool and drop unnecessary init_netclassid_cgroup() · af636337
      Committed by Tejun Heo
      net_prio is the only cgroup which is allowed to be built as a module.
      The savings from allowing one controller to be built as a module are
      tiny especially given that cgroup module support itself adds quite a
      bit of complexity.
      
      Given that none of the other controllers has much chance of being made a
      module and that we're unlikely to add new modular controllers, the
      added complexity is simply not justifiable.
      
      As a first step to drop cgroup module support, this patch changes the
      config option to bool from tristate and drops module related code from
      it.
      
      Also, while an earlier commit fe1217c4 ("net: net_cls: move
      cgroupfs classid handling into core") dropped module support from
      net_cls cgroup, it retained a call to cgroup_load_subsys(), which is
      a noop for built-in controllers.  Drop it along with
      init_netclassid_cgroup().
      
      v2: Removed modular version of task_netprioidx() in
          include/net/netprio_cgroup.h as suggested by Li Zefan.
      
      v3: Rebased on top of fe1217c4 ("net: net_cls: move cgroupfs
          classid handling into core").  net_cls cgroup part is mostly
          dropped except for removal of init_netclassid_cgroup().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      af636337
  13. 04 January 2014, 2 commits
  14. 22 November 2013, 1 commit
  15. 04 November 2013, 1 commit
    • net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0) · f421436a
      Committed by Arvid Brodin
      High-availability Seamless Redundancy ("HSR") provides instant failover
      redundancy for Ethernet networks. It requires a special network topology where
      all nodes are connected in a ring (each node having two physical network
      interfaces). It is suited for applications that demand high availability and
      very short reaction time.
      
      HSR acts on the Ethernet layer, using a registered Ethernet protocol type to
      send special HSR frames in both directions over the ring. The driver creates
      virtual network interfaces that can be used just like any ordinary Linux
      network interface, for IP/TCP/UDP traffic etc. All nodes in the network ring
      must be HSR capable.
      
      This code is a "best effort" to comply with the HSR standard as described in
      IEC 62439-3:2010 (HSRv0).
      Signed-off-by: Arvid Brodin <arvid.brodin@xdin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f421436a
  16. 13 September 2013, 1 commit
  17. 02 August 2013, 1 commit
  18. 31 July 2013, 1 commit
  19. 18 June 2013, 2 commits
  20. 11 June 2013, 1 commit
  21. 06 June 2013, 1 commit
  22. 28 May 2013, 1 commit
    • MPLS: Add limited GSO support · 0d89d203
      Committed by Simon Horman
      In the case where a non-MPLS packet is received and an MPLS stack is
      added it may well be the case that the original skb is GSO but the
      NIC used for transmit does not support GSO of MPLS packets.
      
      The aim of this code is to provide GSO in software for MPLS packets
      whose skbs are GSO.
      
      SKB Usage:
      
      When an implementation adds an MPLS stack to a non-MPLS packet it should do
      the following to skb metadata:
      
      * Set skb->inner_protocol to the old non-MPLS ethertype of the packet.
        skb->inner_protocol is added by this patch.
      
      * Set skb->protocol to the new MPLS ethertype of the packet.
      
      * Set skb->network_header to correspond to the
        end of the L3 header, including the MPLS label stack.
      
      I have posted a patch, "[PATCH v3.29] datapath: Add basic MPLS support to
      kernel", which adds MPLS support to the kernel datapath of Open vSwitch.
      That patch sets the above requirements in datapath/actions.c:push_mpls()
      and was used to exercise this code.  The datapath patch is against the Open
      vSwitch tree but it is intended that it be added to the Open vSwitch code
      present in the mainline Linux kernel at some point.
      
      Features:
      
      I believe that the approach that I have taken is at least partially
      consistent with the handling of other protocols.  Jesse, I understand that
      you have some ideas here.  I am more than happy to change my implementation.
      
      This patch adds dev->mpls_features which may be used by devices
      to advertise features supported for MPLS packets.
      
      A new NETIF_F_MPLS_GSO feature is added for devices which support
      hardware MPLS GSO offload.  Currently no devices support this
      and MPLS GSO always falls back to software.
      
      Alternate Implementation:
      
      One possible alternate implementation is to teach netif_skb_features()
      and skb_network_protocol() about MPLS, in a similar way to their
      understanding of VLANs. I believe this would avoid the need
      for net/mpls/mpls_gso.c and in particular the __skb_push()
      calls in mpls_gso_segment().
      
      I have decided on the implementation in this patch as it should
      not introduce any overhead in the case where mpls_gso is not compiled
      into the kernel or inserted as a module.
      
      MPLS GSO suggested by Jesse Gross.
      Based in part on "v4 GRE: Add TCP segmentation offload for GRE"
      by Pravin B Shelar.
      
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0d89d203
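      A hedged sketch of the three skb metadata rules listed in this commit, assuming
      the MPLS label stack has already been inserted in front of the original L3 header;
      example_set_mpls_metadata() is an illustrative name, not the actual Open vSwitch
      push_mpls().

      /* Sketch only: the skb bookkeeping this patch expects from code that
       * pushes an MPLS stack onto a non-MPLS frame. */
      #include <linux/skbuff.h>
      #include <linux/if_ether.h>

      static void example_set_mpls_metadata(struct sk_buff *skb,
                                            unsigned int label_stack_len)
      {
          /* 1. Preserve the old ethertype so software GSO can still segment
           *    the encapsulated payload. */
          skb->inner_protocol = skb->protocol;

          /* 2. On the wire the frame is now MPLS unicast. */
          skb->protocol = htons(ETH_P_MPLS_UC);

          /* 3. network_header must point past the MPLS label stack. */
          skb_set_network_header(skb,
                                 skb_network_offset(skb) + label_stack_len);
      }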
  23. 21 May 2013, 1 commit
    • rps: selective flow shedding during softnet overflow · 99bbc707
      Committed by Willem de Bruijn
      A cpu executing the network receive path sheds packets when its input
      queue grows to netdev_max_backlog. A single high rate flow (such as a
      spoofed source DoS) can exceed a single cpu's processing rate and will
      degrade throughput of other flows hashed onto the same cpu.
      
      This patch adds a more fine-grained hashtable. If the netdev backlog
      is above a threshold, IRQ cpus track the ratio of total traffic of
      each flow (using 4096 buckets, configurable). The ratio is measured
      by counting the number of packets per flow over the last 256 packets
      from the source cpu. Any flow that occupies a large fraction of this
      (set at 50%) will see packet drop while above the threshold.
      
      Tested:
      Setup is a multi-threaded UDP echo server with network rx IRQ on cpu0,
      kernel receive (RPS) on cpu0 and application threads on cpus 2--7
      each handling 20k req/s. Throughput halves when hit with a 400 kpps
      antagonist storm. With this patch applied, antagonist overload is
      dropped and the server processes its complete load.
      
      The patch is effective when kernel receive processing is the
      bottleneck. The above RPS scenario is an extreme case, but the same is
      reached with RFS and sufficient kernel processing (iptables, packet
      socket tap, ..).
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      99bbc707
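      A hedged userspace sketch of the counting scheme described above: 4096 buckets,
      a window of the last 256 packets, and shedding when one bucket holds more than
      half of that window. The hashing and data layout are illustrative, not the
      kernel's actual implementation.

      /* Sketch only: per-flow counting over a circular window of the last
       * HISTORY packets seen by one cpu. The caller zero-initializes fl. */
      #include <stdbool.h>
      #include <stdint.h>

      #define BUCKETS 4096           /* per the commit, configurable */
      #define HISTORY  256           /* packets remembered per cpu */

      struct flow_limit {
          uint16_t history[HISTORY]; /* bucket of each recent packet */
          uint32_t count[BUCKETS];   /* packets per bucket in the window */
          unsigned int pos;          /* next slot in the circular history */
      };

      /* Returns true if this packet's flow should be dropped while the
       * input queue is above the backlog threshold. */
      static bool flow_limit_should_drop(struct flow_limit *fl, uint32_t hash)
      {
          uint16_t bucket = hash & (BUCKETS - 1);
          uint16_t oldest = fl->history[fl->pos];

          /* Slide the window: forget the oldest packet, record the new one. */
          if (fl->count[oldest] > 0)
              fl->count[oldest]--;
          fl->history[fl->pos] = bucket;
          fl->pos = (fl->pos + 1) % HISTORY;
          fl->count[bucket]++;

          /* Shed if one flow occupies more than half of the recent window. */
          return fl->count[bucket] > HISTORY / 2;
      }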
  24. 02 May 2013, 1 commit
  25. 20 April 2013, 1 commit
  26. 22 March 2013, 1 commit
    • netlink: Diag core and basic socket info dumping (v2) · eaaa3139
      Committed by Andrey Vagin
      The netlink_diag code can be built as a module, just as is done for
      unix sockets.
      
      The core dumping message carries the basic info about netlink sockets:
      family, type and protocol, portid, dst_group, dst_portid, state.
      
      Groups can be received as an optional parameter NETLINK_DIAG_GROUPS.
      
      Netlink sockets can be filtered by protocol.
      
      The socket inode number and cookie are reserved for future per-socket info
      retrieval. The per-protocol filtering is also reserved for the future by
      requiring the sdiag_protocol to be zero.
      
      The file /proc/net/netlink doesn't provide enough information for
      dumping netlink sockets. It doesn't provide dst_group, dst_portid,
      or groups above 32.
      
      v2: fix NETLINK_DIAG_MAX. Now it's equal to the last constant.
      Acked-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Gao feng <gaofeng@cn.fujitsu.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eaaa3139
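      A hedged userspace sketch of composing the dump request described above; field
      values follow the commit text (sdiag_protocol left at zero) and reply parsing is
      elided.

      /* Sketch only: ask sock_diag to dump basic info for netlink sockets. */
      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/socket.h>
      #include <linux/netlink.h>
      #include <linux/sock_diag.h>
      #include <linux/netlink_diag.h>

      int main(void)
      {
          struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
          int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_SOCK_DIAG);
          if (fd < 0) { perror("socket"); return 1; }

          struct {
              struct nlmsghdr nlh;
              struct netlink_diag_req req;
          } msg;

          memset(&msg, 0, sizeof(msg));
          msg.nlh.nlmsg_len   = sizeof(msg);
          msg.nlh.nlmsg_type  = SOCK_DIAG_BY_FAMILY;
          msg.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
          msg.req.sdiag_family   = AF_NETLINK;
          msg.req.sdiag_protocol = 0;                 /* per the commit: must be zero */
          msg.req.ndiag_show     = NDIAG_SHOW_GROUPS; /* optional groups attribute */

          if (sendto(fd, &msg, sizeof(msg), 0,
                     (struct sockaddr *)&nladdr, sizeof(nladdr)) < 0) {
              perror("sendto");
              return 1;
          }
          /* ... recv() and walk the nlmsghdr stream until NLMSG_DONE ... */
          close(fd);
          return 0;
      }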
  27. 11 February 2013, 1 commit
    • VSOCK: Introduce VM Sockets · d021c344
      Committed by Andy King
      VM Sockets allows communication between virtual machines and the hypervisor.
      User level applications both in a virtual machine and on the host can use the
      VM Sockets API, which facilitates fast and efficient communication between
      guest virtual machines and their host.  A socket address family, designed to be
      compatible with UDP and TCP at the interface level, is provided.
      
      Today, VM Sockets is used by various VMware Tools components inside the guest
      for zero-config, network-less access to VMware host services.  In addition to
      this, VMware's users are using VM Sockets for various applications, where
      network access of the virtual machine is restricted or non-existent.  Examples
      of this are VMs communicating with device proxies for proprietary hardware
      running as host applications and automated testing of applications running
      within virtual machines.
      
      The VMware VM Sockets are similar to other socket types, like the Berkeley
      UNIX socket interface.  The VM Sockets module supports both connection-oriented
      stream sockets like TCP, and connectionless datagram sockets like UDP. The VM
      Sockets protocol family is defined as "AF_VSOCK" and the socket operations are
      split for SOCK_DGRAM and SOCK_STREAM.
      
      For additional information about the use of VM Sockets, please refer to the
      VM Sockets Programming Guide available at:
      
      https://www.vmware.com/support/developer/vmci-sdk/
      Signed-off-by: George Zhang <georgezhang@vmware.com>
      Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
      Signed-off-by: Andy King <acking@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d021c344
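      A hedged userspace sketch of the AF_VSOCK stream socket usage described above;
      the port number is an arbitrary example value.

      /* Sketch only: connect a stream VM socket from a guest to a service
       * listening on the host (well-known CID 2). */
      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/socket.h>
      #include <linux/vm_sockets.h>

      #ifndef AF_VSOCK
      #define AF_VSOCK 40   /* assumption: matches the kernel's PF_VSOCK */
      #endif

      int main(void)
      {
          struct sockaddr_vm addr;
          int fd = socket(AF_VSOCK, SOCK_STREAM, 0);
          if (fd < 0) { perror("socket"); return 1; }

          memset(&addr, 0, sizeof(addr));
          addr.svm_family = AF_VSOCK;
          addr.svm_cid    = VMADDR_CID_HOST;  /* the host side of the link */
          addr.svm_port   = 1234;             /* example service port */

          if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
              perror("connect");
              close(fd);
              return 1;
          }

          if (write(fd, "hello\n", 6) < 0)
              perror("write");
          close(fd);
          return 0;
      }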
  28. 01 February 2013, 1 commit
    • wanrouter: completely decouple obsolete code from kernel. · a786a7c0
      Committed by Paul Gortmaker
      The original suggestion to delete wanrouter started earlier
      with the mainline commit f0d1b3c2
      ("net/wanrouter: Deprecate and schedule for removal") in May 2012.
      
      More importantly, Dan Carpenter found[1] that the driver had a
      fundamental breakage introduced back in 2008, with commit
      7be6065b ("netdevice wanrouter: Convert directly reference of
      netdev->priv").  So we know with certainty that the code hasn't been
      used by anyone willing to at least make the effort to send an e-mail
      report of breakage for at least 4 years.
      
      This commit decouples the wanrouter subsystem by going
      after the Makefile/Kconfig and similar files, so that the mainline
      files we are keeping do not have the big wanrouter file/driver
      deletion commit tied into their history.
      
      Once this commit is in place, we then can remove the obsolete cyclomx
      drivers and similar that have a dependency on CONFIG_WAN_ROUTER_DRIVERS.
      
      [1] http://www.spinics.net/lists/netdev/msg218670.html
      Originally-by: Joe Perches <joe@perches.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      a786a7c0
  29. 12 January 2013, 1 commit
  30. 11 January 2013, 1 commit
  31. 05 September 2012, 1 commit
    • net: Add INET dependency on aes crypto for the sake of TCP fastopen. · 798b2cbf
      Committed by David S. Miller
      Stephen Rothwell says:
      
      ====================
      After merging the final tree, today's linux-next build (powerpc
      ppc44x_defconfig) failed like this:
      
      net/built-in.o: In function `tcp_fastopen_ctx_free':
      tcp_fastopen.c:(.text+0x5cc5c): undefined reference to `crypto_destroy_tfm'
      net/built-in.o: In function `tcp_fastopen_reset_cipher':
      (.text+0x5cccc): undefined reference to `crypto_alloc_base'
      net/built-in.o: In function `tcp_fastopen_reset_cipher':
      (.text+0x5cd6c): undefined reference to `crypto_destroy_tfm'
      
      Presumably caused by commit 10467163 ("tcp: TCP Fast Open Server -
      header & support functions") from the net-next tree.  I assume that some
      dependency on the CRYPTO infrastructure is missing.
      
      I have reverted commit 1bed966c ("Merge branch
      'tcp_fastopen_server'") for today.
      ====================
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      798b2cbf
  32. 22 May 2012, 1 commit
  33. 18 May 2012, 2 commits
  34. 17 May 2012, 1 commit
  35. 04 December 2011, 1 commit
    • net: Add Open vSwitch kernel components. · ccb1352e
      Committed by Jesse Gross
      Open vSwitch is a multilayer Ethernet switch targeted at virtualized
      environments.  In addition to supporting a variety of features
      expected in a traditional hardware switch, it enables fine-grained
      programmatic extension and flow-based control of the network.
      This control is useful in a wide variety of applications but is
      particularly important in multi-server virtualization deployments,
      which are often characterized by highly dynamic endpoints and the need
      to maintain logical abstractions for multiple tenants.
      
      The Open vSwitch datapath provides an in-kernel fast path for packet
      forwarding.  It is complemented by a userspace daemon, ovs-vswitchd,
      which is able to accept configuration from a variety of sources and
      translate it into packet processing rules.
      
      See http://openvswitch.org for more information and userspace
      utilities.
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      ccb1352e
  36. 30 November 2011, 1 commit
    • bql: Byte queue limits · 114cf580
      Committed by Tom Herbert
      Networking stack support for byte queue limits, using the dynamic queue
      limits library.  Byte queue limits are maintained per transmit queue,
      and a dql structure has been added to netdev_queue structure for this
      purpose.
      
      Configuration of bql is in the tx-<n> sysfs directory for the queue
      under the byte_queue_limits directory.  Configuration includes:
      limit_min, bql minimum limit
      limit_max, bql maximum limit
      hold_time, bql slack hold time
      
      Also under the directory are:
      limit, current byte limit
      inflight, current number of bytes on the queue
      Signed-off-by: Tom Herbert <therbert@google.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      114cf580
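      A hedged userspace sketch of inspecting the per-queue BQL state via the sysfs
      files named above; eth0 and tx-0 are example names.

      /* Sketch only: print the current BQL limit and in-flight byte count
       * for one transmit queue. */
      #include <stdio.h>

      static void print_bql_file(const char *name)
      {
          char path[256], buf[64];
          FILE *f;

          snprintf(path, sizeof(path),
                   "/sys/class/net/eth0/queues/tx-0/byte_queue_limits/%s", name);

          f = fopen(path, "r");
          if (!f) { perror(path); return; }
          if (fgets(buf, sizeof(buf), f))
              printf("%-10s %s", name, buf);
          fclose(f);
      }

      int main(void)
      {
          print_bql_file("limit");      /* current byte limit */
          print_bql_file("inflight");   /* bytes currently on the queue */
          print_bql_file("limit_max");  /* configured upper bound */
          return 0;
      }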
  37. 23 November 2011, 1 commit
    • net: add network priority cgroup infrastructure (v4) · 5bc1421e
      Committed by Neil Horman
      This patch adds the infrastructure code to create the network priority
      cgroup.  The cgroup, in addition to the standard processes file, creates two
      control files:
      
      1) prioidx - This is a read-only file that exports the index of this cgroup.
      This is a value that is both arbitrary and unique to a cgroup in this subsystem,
      and is used to index the per-device priority map
      
      2) priomap - This is a writeable file.  On read it reports a table of 2-tuples
      <name:priority>, where name is the name of a network interface and priority
      indicates the priority assigned to frames egressing on the named interface and
      originating from a pid in this cgroup.
      
      This cgroup allows for skb priority to be set prior to a root qdisc getting
      selected. This is beneficial for DCB-enabled systems, in that it allows any
      application to use DCB-configured priorities without application modification.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      CC: Robert Love <robert.w.love@intel.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5bc1421e
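      A hedged userspace sketch of configuring the priomap described above; the mount
      point and the control-file name (net_prio.ifpriomap in cgroupfs) are assumptions,
      and eth0/5 are example values.

      /* Sketch only: assign priority 5 to frames egressing eth0 for tasks
       * in one net_prio cgroup, using an "<ifname> <priority>" write. */
      #include <stdio.h>

      int main(void)
      {
          /* Path is an assumption; it depends on where the cgroup is mounted. */
          const char *path =
              "/sys/fs/cgroup/net_prio/mygroup/net_prio.ifpriomap";

          FILE *f = fopen(path, "w");
          if (!f) { perror(path); return 1; }

          fprintf(f, "eth0 5\n");   /* interface name, then priority */
          fclose(f);
          return 0;
      }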