1. 27 3月, 2018 1 次提交
  2. 05 3月, 2018 1 次提交
  3. 02 3月, 2018 1 次提交
  4. 01 3月, 2018 1 次提交
  5. 03 11月, 2017 1 次提交
    • T
      ipv6: Implement limits on Hop-by-Hop and Destination options · 47d3d7ac
      Tom Herbert 提交于
      RFC 8200 (IPv6) defines Hop-by-Hop options and Destination options
      extension headers. Both of these carry a list of TLVs which is
      only limited by the maximum length of the extension header (2048
      bytes). By the spec a host must process all the TLVs in these
      options, however these could be used as a fairly obvious
      denial of service attack. I think this could in fact be
      a significant DOS vector on the Internet, one mitigating
      factor might be that many FWs drop all packets with EH (and
      obviously this is only IPv6) so an Internet wide attack might not
      be so effective (yet!).
      
      By my calculation, the worse case packet with TLVs in a standard
      1500 byte MTU packet that would be processed by the stack contains
      1282 invidual TLVs (including pad TLVS) or 724 two byte TLVs. I
      wrote a quick test program that floods a whole bunch of these
      packets to a host and sure enough there is substantial time spent
      in ip6_parse_tlv. These packets contain nothing but unknown TLVS
      (that are ignored), TLV padding, and bogus UDP header with zero
      payload length.
      
        25.38%  [kernel]                    [k] __fib6_clean_all
        21.63%  [kernel]                    [k] ip6_parse_tlv
         4.21%  [kernel]                    [k] __local_bh_enable_ip
         2.18%  [kernel]                    [k] ip6_pol_route.isra.39
         1.98%  [kernel]                    [k] fib6_walk_continue
         1.88%  [kernel]                    [k] _raw_write_lock_bh
         1.65%  [kernel]                    [k] dst_release
      
      This patch adds configurable limits to Destination and Hop-by-Hop
      options. There are three limits that may be set:
        - Limit the number of options in a Hop-by-Hop or Destination options
          extension header.
        - Limit the byte length of a Hop-by-Hop or Destination options
          extension header.
        - Disallow unrecognized options in a Hop-by-Hop or Destination
          options extension header.
      
      The limits are set in corresponding sysctls:
      
        ipv6.sysctl.max_dst_opts_cnt
        ipv6.sysctl.max_hbh_opts_cnt
        ipv6.sysctl.max_dst_opts_len
        ipv6.sysctl.max_hbh_opts_len
      
      If a max_*_opts_cnt is less than zero then unknown TLVs are disallowed.
      The number of known TLVs that are allowed is the absolute value of
      this number.
      
      If a limit is exceeded when processing an extension header the packet is
      dropped.
      
      Default values are set to 8 for options counts, and set to INT_MAX
      for maximum length. Note the choice to limit options to 8 is an
      arbitrary guess (roughly based on the fact that the stack supports
      three HBH options and just one destination option).
      
      These limits have being proposed in draft-ietf-6man-rfc6434-bis.
      
      Tested (by Martin Lau)
      
      I tested out 1 thread (i.e. one raw_udp process).
      
      I changed the net.ipv6.max_dst_(opts|hbh)_number between 8 to 2048.
      With sysctls setting to 2048, the softirq% is packed to 100%.
      With 8, the softirq% is almost unnoticable from mpstat.
      
      v2;
        - Code and documention cleanup.
        - Change references of RFC2460 to be RFC8200.
        - Add reference to RFC6434-bis where the limits will be in standard.
      Signed-off-by: NTom Herbert <tom@quantonium.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      47d3d7ac
  6. 02 11月, 2017 1 次提交
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  7. 20 9月, 2017 1 次提交
    • E
      ipv6: addrlabel: per netns list · a90c9347
      Eric Dumazet 提交于
      Having a global list of labels do not scale to thousands of
      netns in the cloud era. This causes quadratic behavior on
      netns creation and deletion.
      
      This is time having a per netns list of ~10 labels.
      
      Tested:
      
      $ time perf record (for f in `seq 1 3000` ; do ip netns add tast$f; done)
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 3.637 MB perf.data (~158898 samples) ]
      
      real    0m20.837s # instead of 0m24.227s
      user    0m0.328s
      sys     0m20.338s # instead of 0m23.753s
      
          16.17%       ip  [kernel.kallsyms]  [k] netlink_broadcast_filtered
          12.30%       ip  [kernel.kallsyms]  [k] netlink_has_listeners
           6.76%       ip  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
           5.78%       ip  [kernel.kallsyms]  [k] memset_erms
           5.77%       ip  [kernel.kallsyms]  [k] kobject_uevent_env
           5.18%       ip  [kernel.kallsyms]  [k] refcount_sub_and_test
           4.96%       ip  [kernel.kallsyms]  [k] _raw_read_lock
           3.82%       ip  [kernel.kallsyms]  [k] refcount_inc_not_zero
           3.33%       ip  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
           2.11%       ip  [kernel.kallsyms]  [k] unmap_page_range
           1.77%       ip  [kernel.kallsyms]  [k] __wake_up
           1.69%       ip  [kernel.kallsyms]  [k] strlen
           1.17%       ip  [kernel.kallsyms]  [k] __wake_up_common
           1.09%       ip  [kernel.kallsyms]  [k] insert_header
           1.04%       ip  [kernel.kallsyms]  [k] page_remove_rmap
           1.01%       ip  [kernel.kallsyms]  [k] consume_skb
           0.98%       ip  [kernel.kallsyms]  [k] netlink_trim
           0.51%       ip  [kernel.kallsyms]  [k] kernfs_link_sibling
           0.51%       ip  [kernel.kallsyms]  [k] filemap_map_pages
           0.46%       ip  [kernel.kallsyms]  [k] memcpy_erms
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a90c9347
  8. 25 8月, 2017 1 次提交
    • J
      ipv6: Add sysctl for per namespace flow label reflection · 22b6722b
      Jakub Sitnicki 提交于
      Reflecting IPv6 Flow Label at server nodes is useful in environments
      that employ multipath routing to load balance the requests. As "IPv6
      Flow Label Reflection" standard draft [1] points out - ICMPv6 PTB error
      messages generated in response to a downstream packets from the server
      can be routed by a load balancer back to the original server without
      looking at transport headers, if the server applies the flow label
      reflection. This enables the Path MTU Discovery past the ECMP router in
      load-balance or anycast environments where each server node is reachable
      by only one path.
      
      Introduce a sysctl to enable flow label reflection per net namespace for
      all newly created sockets. Same could be earlier achieved only per
      socket by setting the IPV6_FL_F_REFLECT flag for the IPV6_FLOWLABEL_MGR
      socket option.
      
      [1] https://tools.ietf.org/html/draft-wang-6man-flow-label-reflection-01Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      22b6722b
  9. 09 8月, 2017 1 次提交
    • V
      net: ipv6: avoid overhead when no custom FIB rules are installed · feca7d8c
      Vincent Bernat 提交于
      If the user hasn't installed any custom rules, don't go through the
      whole FIB rules layer. This is pretty similar to f4530fa5 (ipv4:
      Avoid overhead when no custom FIB rules are installed).
      
      Using a micro-benchmark module [1], timing ip6_route_output() with
      get_cycles(), with 40,000 routes in the main routing table, before this
      patch:
      
          min=606 max=12911 count=627 average=1959 95th=4903 90th=3747 50th=1602 mad=821
          table=254 avgdepth=21.8 maxdepth=39
          value │                         ┊                            count
            600 │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒                                         199
            880 │▒▒▒░░░░░░░░░░░░░░░░                                      43
           1160 │▒▒▒░░░░░░░░░░░░░░░░░░░░                                  48
           1440 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░                               43
           1720 │▒▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░                          59
           2000 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                      50
           2280 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                    26
           2560 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                  31
           2840 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░               28
           3120 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░              17
           3400 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             17
           3680 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             8
           3960 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           11
           4240 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░            6
           4520 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           6
           4800 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           9
      
      After:
      
          min=544 max=11687 count=627 average=1776 95th=4546 90th=3585 50th=1227 mad=565
          table=254 avgdepth=21.8 maxdepth=39
          value │                         ┊                            count
            540 │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒                                        201
            800 │▒▒▒▒▒░░░░░░░░░░░░░░░░                                    63
           1060 │▒▒▒▒▒░░░░░░░░░░░░░░░░░░░░░                               68
           1320 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░                            39
           1580 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                         32
           1840 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                       32
           2100 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                    34
           2360 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                 33
           2620 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░               26
           2880 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░              22
           3140 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░              9
           3400 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             8
           3660 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             9
           3920 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░            8
           4180 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           8
           4440 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           8
      
      At the frequency of the host during the bench (~ 3.7 GHz), this is
      about a 100 ns difference on the median value.
      
      A next step would be to collapse local and main tables, as in
      0ddcf43d (ipv4: FIB Local/MAIN table collapse).
      
      [1]: https://github.com/vincentbernat/network-lab/blob/master/lab-routes-ipv6/kbench_mod.cSigned-off-by: NVincent Bernat <vincent@bernat.im>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      feca7d8c
  10. 04 8月, 2017 1 次提交
  11. 10 11月, 2016 1 次提交
    • D
      ipv6: sr: add code base for control plane support of SR-IPv6 · 915d7e5e
      David Lebrun 提交于
      This patch adds the necessary hooks and structures to provide support
      for SR-IPv6 control plane, essentially the Generic Netlink commands
      that will be used for userspace control over the Segment Routing
      kernel structures.
      
      The genetlink commands provide control over two different structures:
      tunnel source and HMAC data. The tunnel source is the source address
      that will be used by default when encapsulating packets into an
      outer IPv6 header + SRH. If the tunnel source is set to :: then an
      address of the outgoing interface will be selected as the source.
      
      The HMAC commands currently just return ENOTSUPP and will be implemented
      in a future patch.
      Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      915d7e5e
  12. 09 3月, 2016 2 次提交
  13. 10 7月, 2015 1 次提交
    • T
      ipv6: Nonlocal bind · 35a256fe
      Tom Herbert 提交于
      Add support to allow non-local binds similar to how this was done for IPv4.
      Non-local binds are very useful in emulating the Internet in a box, etc.
      
      This add the ip_nonlocal_bind sysctl under ipv6.
      
      Testing:
      
      Set up nonlocal binding and receive routing on a host, e.g.:
      
      ip -6 rule add from ::/0 iif eth0 lookup 200
      ip -6 route add local 2001:0:0:1::/64 dev lo proto kernel scope host table 200
      sysctl -w net.ipv6.ip_nonlocal_bind=1
      
      Set up routing to 2001:0:0:1::/64 on peer to go to first host
      
      ping6 -I 2001:0:0:1::1 peer-address -- to verify
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      35a256fe
  14. 04 5月, 2015 1 次提交
    • T
      ipv6: Flow label state ranges · 82a584b7
      Tom Herbert 提交于
      This patch divides the IPv6 flow label space into two ranges:
      0-7ffff is reserved for flow label manager, 80000-fffff will be
      used for creating auto flow labels (per RFC6438). This only affects how
      labels are set on transmit, it does not affect receive. This range split
      can be disbaled by systcl.
      
      Background:
      
      IPv6 flow labels have been an unmitigated disappointment thus far
      in the lifetime of IPv6. Support in HW devices to use them for ECMP
      is lacking, and OSes don't turn them on by default. If we had these
      we could get much better hashing in IPv6 networks without resorting
      to DPI, possibly eliminating some of the motivations to to define new
      encaps in UDP just for getting ECMP.
      
      Unfortunately, the initial specfications of IPv6 did not clarify
      how they are to be used. There has always been a vague concept that
      these can be used for ECMP, flow hashing, etc. and we do now have a
      good standard how to this in RFC6438. The problem is that flow labels
      can be either stateful or stateless (as in RFC6438), and we are
      presented with the possibility that a stateless label may collide
      with a stateful one.  Attempts to split the flow label space were
      rejected in IETF. When we added support in Linux for RFC6438, we
      could not turn on flow labels by default due to this conflict.
      
      This patch splits the flow label space and should give us
      a path to enabling auto flow labels by default for all IPv6 packets.
      This is an API change so we need to consider compatibility with
      existing deployment. The stateful range is chosen to be the lower
      values in hopes that most uses would have chosen small numbers.
      
      Once we resolve the stateless/stateful issue, we can proceed to
      look at enabling RFC6438 flow labels by default (starting with
      scaled testing).
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      82a584b7
  15. 24 3月, 2015 1 次提交
  16. 28 2月, 2015 1 次提交
    • M
      multicast: Extend ip address command to enable multicast group join/leave on · 93a714d6
      Madhu Challa 提交于
      Joining multicast group on ethernet level via "ip maddr" command would
      not work if we have an Ethernet switch that does igmp snooping since
      the switch would not replicate multicast packets on ports that did not
      have IGMP reports for the multicast addresses.
      
      Linux vxlan interfaces created via "ip link add vxlan" have the group option
      that enables then to do the required join.
      
      By extending ip address command with option "autojoin" we can get similar
      functionality for openvswitch vxlan interfaces as well as other tunneling
      mechanisms that need to receive multicast traffic. The kernel code is
      structured similar to how the vxlan driver does a group join / leave.
      
      example:
      ip address add 224.1.1.10/24 dev eth5 autojoin
      ip address del 224.1.1.10/24 dev eth5
      Signed-off-by: NMadhu Challa <challa@noironetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93a714d6
  17. 07 10月, 2014 1 次提交
  18. 08 7月, 2014 1 次提交
    • T
      ipv6: Implement automatic flow label generation on transmit · cb1ce2ef
      Tom Herbert 提交于
      Automatically generate flow labels for IPv6 packets on transmit.
      The flow label is computed based on skb_get_hash. The flow label will
      only automatically be set when it is zero otherwise (i.e. flow label
      manager hasn't set one). This supports the transmit side functionality
      of RFC 6438.
      
      Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior
      system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this
      functionality per socket.
      
      By default, auto flowlabels are disabled to avoid possible conflicts
      with flow label manager, however if this feature proves useful we
      may want to enable it by default.
      
      It should also be noted that FreeBSD has already implemented automatic
      flow labels (including the sysctl and socket option). In FreeBSD,
      automatic flow labels default to enabled.
      
      Performance impact:
      
      Running super_netperf with 200 flows for TCP_RR and UDP_RR for
      IPv6. Note that in UDP case, __skb_get_hash will be called for
      every packet with explains slight regression. In the TCP case
      the hash is saved in the socket so there is no regression.
      
      Automatic flow labels disabled:
      
        TCP_RR:
          86.53% CPU utilization
          127/195/322 90/95/99% latencies
          1.40498e+06 tps
      
        UDP_RR:
          90.70% CPU utilization
          118/168/243 90/95/99% latencies
          1.50309e+06 tps
      
      Automatic flow labels enabled:
      
        TCP_RR:
          85.90% CPU utilization
          128/199/337 90/95/99% latencies
          1.40051e+06
      
        UDP_RR
          92.61% CPU utilization
          115/164/236 90/95/99% latencies
          1.4687e+06
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cb1ce2ef
  19. 14 5月, 2014 1 次提交
    • L
      net: add a sysctl to reflect the fwmark on replies · e110861f
      Lorenzo Colitti 提交于
      Kernel-originated IP packets that have no user socket associated
      with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.)
      are emitted with a mark of zero. Add a sysctl to make them have
      the same mark as the packet they are replying to.
      
      This allows an administrator that wishes to do so to use
      mark-based routing, firewalling, etc. for these replies by
      marking the original packets inbound.
      
      Tested using user-mode linux:
       - ICMP/ICMPv6 echo replies and errors.
       - TCP RST packets (IPv4 and IPv6).
      Signed-off-by: NLorenzo Colitti <lorenzo@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e110861f
  20. 20 1月, 2014 1 次提交
  21. 15 1月, 2014 1 次提交
  22. 08 1月, 2014 1 次提交
  23. 01 8月, 2013 1 次提交
  24. 25 3月, 2013 1 次提交
  25. 06 2月, 2013 1 次提交
  26. 20 9月, 2012 1 次提交
  27. 30 8月, 2012 1 次提交
  28. 09 6月, 2012 1 次提交
  29. 21 4月, 2012 1 次提交
  30. 11 5月, 2010 4 次提交
    • P
      ipv6: ip6mr: support multiple tables · d1db275d
      Patrick McHardy 提交于
      This patch adds support for multiple independant multicast routing instances,
      named "tables".
      
      Userspace multicast routing daemons can bind to a specific table instance by
      issuing a setsockopt call using a new option MRT6_TABLE. The table number is
      stored in the raw socket data and affects all following ip6mr setsockopt(),
      getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT)
      is created with a default routing rule pointing to it. Newly created pim6reg
      devices have the table number appended ("pim6regX"), with the exception of
      devices created in the default table, which are named just "pim6reg" for
      compatibility reasons.
      
      Packets are directed to a specific table instance using routing rules,
      similar to how regular routing rules work. Currently iif, oif and mark
      are supported as keys, source and destination addresses could be supported
      additionally.
      
      Example usage:
      
      - bind pimd/xorp/... to a specific table:
      
      uint32_t table = 123;
      setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table));
      
      - create routing rules directing packets to the new table:
      
      # ip -6 mrule add iif eth0 lookup 123
      # ip -6 mrule add oif eth0 lookup 123
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      d1db275d
    • P
      6bd52143
    • P
      f30a7784
    • P
      ipv6: ip6mr: move unres_queue and timer to per-namespace data · c476efbc
      Patrick McHardy 提交于
      The unres_queue is currently shared between all namespaces. Following patches
      will additionally allow to create multiple multicast routing tables in each
      namespace. Having a single shared queue for all these users seems to excessive,
      move the queue and the cleanup timer to the per-namespace data to unshare it.
      
      As a side-effect, this fixes a bug in the seq file iteration functions: the
      first entry returned is always from the current namespace, entries returned
      after that may belong to any namespace.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      c476efbc
  31. 18 1月, 2010 1 次提交
  32. 02 9月, 2009 1 次提交
    • A
      netns: embed ip6_dst_ops directly · 86393e52
      Alexey Dobriyan 提交于
      struct net::ipv6.ip6_dst_ops is separatedly dynamically allocated,
      but there is no fundamental reason for it. Embed it directly into
      struct netns_ipv6.
      
      For that:
      * move struct dst_ops into separate header to fix circular dependencies
      	I honestly tried not to, it's pretty impossible to do other way
      * drop dynamical allocation, allocate together with netns
      
      For a change, remove struct dst_ops::dst_net, it's deducible
      by using container_of() given dst_ops pointer.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      86393e52
  33. 11 12月, 2008 4 次提交