1. 09 6月, 2021 1 次提交
  2. 09 4月, 2021 1 次提交
  3. 17 11月, 2020 1 次提交
  4. 11 11月, 2020 1 次提交
  5. 31 5月, 2020 1 次提交
  6. 17 5月, 2020 1 次提交
  7. 27 4月, 2020 1 次提交
  8. 10 4月, 2020 1 次提交
    • T
      net: ipv4: devinet: Fix crash when add/del multicast IP with autojoin · 690cc863
      Taras Chornyi 提交于
      When CONFIG_IP_MULTICAST is not set and multicast ip is added to the device
      with autojoin flag or when multicast ip is deleted kernel will crash.
      
      steps to reproduce:
      
      ip addr add 224.0.0.0/32 dev eth0
      ip addr del 224.0.0.0/32 dev eth0
      
      or
      
      ip addr add 224.0.0.0/32 dev eth0 autojoin
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000088
       pc : _raw_write_lock_irqsave+0x1e0/0x2ac
       lr : lock_sock_nested+0x1c/0x60
       Call trace:
        _raw_write_lock_irqsave+0x1e0/0x2ac
        lock_sock_nested+0x1c/0x60
        ip_mc_config.isra.28+0x50/0xe0
        inet_rtm_deladdr+0x1a8/0x1f0
        rtnetlink_rcv_msg+0x120/0x350
        netlink_rcv_skb+0x58/0x120
        rtnetlink_rcv+0x14/0x20
        netlink_unicast+0x1b8/0x270
        netlink_sendmsg+0x1a0/0x3b0
        ____sys_sendmsg+0x248/0x290
        ___sys_sendmsg+0x80/0xc0
        __sys_sendmsg+0x68/0xc0
        __arm64_sys_sendmsg+0x20/0x30
        el0_svc_common.constprop.2+0x88/0x150
        do_el0_svc+0x20/0x80
       el0_sync_handler+0x118/0x190
        el0_sync+0x140/0x180
      
      Fixes: 93a714d6 ("multicast: Extend ip address command to enable multicast group join/leave on")
      Signed-off-by: NTaras Chornyi <taras.chornyi@plvision.eu>
      Signed-off-by: NVadym Kochan <vadym.kochan@plvision.eu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      690cc863
  9. 13 3月, 2020 1 次提交
  10. 08 12月, 2019 1 次提交
    • E
      inet: protect against too small mtu values. · 501a90c9
      Eric Dumazet 提交于
      syzbot was once again able to crash a host by setting a very small mtu
      on loopback device.
      
      Let's make inetdev_valid_mtu() available in include/net/ip.h,
      and use it in ip_setup_cork(), so that we protect both ip_append_page()
      and __ip_append_data()
      
      Also add a READ_ONCE() when the device mtu is read.
      
      Pairs this lockless read with one WRITE_ONCE() in __dev_set_mtu(),
      even if other code paths might write over this field.
      
      Add a big comment in include/linux/netdevice.h about dev->mtu
      needing READ_ONCE()/WRITE_ONCE() annotations.
      
      Hopefully we will add the missing ones in followup patches.
      
      [1]
      
      refcount_t: saturated; leaking memory.
      WARNING: CPU: 0 PID: 9464 at lib/refcount.c:22 refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 9464 Comm: syz-executor850 Not tainted 5.4.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       panic+0x2e3/0x75c kernel/panic.c:221
       __warn.cold+0x2f/0x3e kernel/panic.c:582
       report_bug+0x289/0x300 lib/bug.c:195
       fixup_bug arch/x86/kernel/traps.c:174 [inline]
       fixup_bug arch/x86/kernel/traps.c:169 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
       invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
      RIP: 0010:refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Code: 06 31 ff 89 de e8 c8 f5 e6 fd 84 db 0f 85 6f ff ff ff e8 7b f4 e6 fd 48 c7 c7 e0 71 4f 88 c6 05 56 a6 a4 06 01 e8 c7 a8 b7 fd <0f> 0b e9 50 ff ff ff e8 5c f4 e6 fd 0f b6 1d 3d a6 a4 06 31 ff 89
      RSP: 0018:ffff88809689f550 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff815e4336 RDI: ffffed1012d13e9c
      RBP: ffff88809689f560 R08: ffff88809c50a3c0 R09: fffffbfff15d31b1
      R10: fffffbfff15d31b0 R11: ffffffff8ae98d87 R12: 0000000000000001
      R13: 0000000000040100 R14: ffff888099041104 R15: ffff888218d96e40
       refcount_add include/linux/refcount.h:193 [inline]
       skb_set_owner_w+0x2b6/0x410 net/core/sock.c:1999
       sock_wmalloc+0xf1/0x120 net/core/sock.c:2096
       ip_append_page+0x7ef/0x1190 net/ipv4/ip_output.c:1383
       udp_sendpage+0x1c7/0x480 net/ipv4/udp.c:1276
       inet_sendpage+0xdb/0x150 net/ipv4/af_inet.c:821
       kernel_sendpage+0x92/0xf0 net/socket.c:3794
       sock_sendpage+0x8b/0xc0 net/socket.c:936
       pipe_to_sendpage+0x2da/0x3c0 fs/splice.c:458
       splice_from_pipe_feed fs/splice.c:512 [inline]
       __splice_from_pipe+0x3ee/0x7c0 fs/splice.c:636
       splice_from_pipe+0x108/0x170 fs/splice.c:671
       generic_splice_sendpage+0x3c/0x50 fs/splice.c:842
       do_splice_from fs/splice.c:861 [inline]
       direct_splice_actor+0x123/0x190 fs/splice.c:1035
       splice_direct_to_actor+0x3b4/0xa30 fs/splice.c:990
       do_splice_direct+0x1da/0x2a0 fs/splice.c:1078
       do_sendfile+0x597/0xd00 fs/read_write.c:1464
       __do_sys_sendfile64 fs/read_write.c:1525 [inline]
       __se_sys_sendfile64 fs/read_write.c:1511 [inline]
       __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441409
      Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffb64c4f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441409
      RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000005
      RBP: 0000000000073b8a R08: 0000000000000010 R09: 0000000000000010
      R10: 0000000000010001 R11: 0000000000000246 R12: 0000000000402180
      R13: 0000000000402210 R14: 0000000000000000 R15: 0000000000000000
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 1470ddf7 ("inet: Remove explicit write references to sk/inet in ip_append_data")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      501a90c9
  11. 02 7月, 2019 1 次提交
    • M
      ipv4: don't set IPv6 only flags to IPv4 addresses · 2e605463
      Matteo Croce 提交于
      Avoid the situation where an IPV6 only flag is applied to an IPv4 address:
      
          # ip addr add 192.0.2.1/24 dev dummy0 nodad home mngtmpaddr noprefixroute
          # ip -4 addr show dev dummy0
          2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
              inet 192.0.2.1/24 scope global noprefixroute dummy0
                 valid_lft forever preferred_lft forever
      
      Or worse, by sending a malicious netlink command:
      
          # ip -4 addr show dev dummy0
          2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
              inet 192.0.2.1/24 scope global nodad optimistic dadfailed home tentative mngtmpaddr noprefixroute stable-privacy dummy0
                 valid_lft forever preferred_lft forever
      Signed-off-by: NMatteo Croce <mcroce@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2e605463
  12. 28 6月, 2019 1 次提交
  13. 25 6月, 2019 2 次提交
    • S
      ipv4: fix confirm_addr_indev() when enable route_localnet · 650638a7
      Shijie Luo 提交于
      When arp_ignore=3, the NIC won't reply for scope host addresses, but
      if enable route_locanet, we need to reply ip address with head 127 and
      scope RT_SCOPE_HOST.
      
      Fixes: d0daebc3 ("ipv4: Add interface option to enable routing of 127.0.0.0/8")
      Signed-off-by: NShijie Luo <luoshijie1@huawei.com>
      Signed-off-by: NZhiqiang Liu <liuzhiqiang26@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      650638a7
    • S
      ipv4: fix inet_select_addr() when enable route_localnet · d8c444d5
      Shijie Luo 提交于
      Suppose we have two interfaces eth0 and eth1 in two hosts, follow
      the same steps in the two hosts:
       # sysctl -w net.ipv4.conf.eth1.route_localnet=1
       # sysctl -w net.ipv4.conf.eth1.arp_announce=2
       # ip route del 127.0.0.0/8 dev lo table local
      and then set ip to eth1 in host1 like:
       # ifconfig eth1 127.25.3.4/24
      set ip to eth2 in host2 and ping host1:
       # ifconfig eth1 127.25.3.14/24
       # ping -I eth1 127.25.3.4
      Well, host2 cannot connect to host1.
      
      When set a ip address with head 127, the scope of the address defaults
      to RT_SCOPE_HOST. In this situation, host2 will use arp_solicit() to
      send a arp request for the mac address of host1 with ip
      address 127.25.3.14. When arp_announce=2, inet_select_addr() cannot
      select a correct saddr with condition ifa->ifa_scope > scope, because
      ifa_scope is RT_SCOPE_HOST and scope is RT_SCOPE_LINK. Then,
      inet_select_addr() will go to no_in_dev to lookup all interfaces to find
      a primary ip and finally get the primary ip of eth0.
      
      Here I add a localnet_scope defaults to RT_SCOPE_HOST, and when
      route_localnet is enabled, this value changes to RT_SCOPE_LINK to make
      inet_select_addr() find a correct primary ip as saddr of arp request.
      
      Fixes: d0daebc3 ("ipv4: Add interface option to enable routing of 127.0.0.0/8")
      Signed-off-by: NShijie Luo <luoshijie1@huawei.com>
      Signed-off-by: NZhiqiang Liu <liuzhiqiang26@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d8c444d5
  14. 18 6月, 2019 1 次提交
    • F
      net: ipv4: remove erroneous advancement of list pointer · 40008e92
      Florian Westphal 提交于
      Causes crash when lifetime expires on an adress as garbage is
      dereferenced soon after.
      
      This used to look like this:
      
       for (ifap = &ifa->ifa_dev->ifa_list;
            *ifap != NULL; ifap = &(*ifap)->ifa_next) {
                if (*ifap == ifa) ...
      
      but this was changed to:
      
      struct in_ifaddr *tmp;
      
      ifap = &ifa->ifa_dev->ifa_list;
      tmp = rtnl_dereference(*ifap);
      while (tmp) {
         tmp = rtnl_dereference(tmp->ifa_next); // Bogus
         if (rtnl_dereference(*ifap) == ifa) {
           ...
         ifap = &tmp->ifa_next;		// Can be NULL
         tmp = rtnl_dereference(*ifap);	// Dereference
         }
      }
      
      Remove the bogus assigment/list entry skip.
      
      Fixes: 2638eb8b ("net: ipv4: provide __rcu annotation for ifa_list")
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      40008e92
  15. 05 6月, 2019 1 次提交
  16. 03 6月, 2019 3 次提交
  17. 31 5月, 2019 1 次提交
  18. 28 4月, 2019 1 次提交
    • J
      netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg 提交于
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cb08174
  19. 05 3月, 2019 1 次提交
  20. 23 1月, 2019 1 次提交
    • C
      net: introduce a knob to control whether to inherit devconf config · 856c395c
      Cong Wang 提交于
      There have been many people complaining about the inconsistent
      behaviors of IPv4 and IPv6 devconf when creating new network
      namespaces.  Currently, for IPv4, we inherit all current settings
      from init_net, but for IPv6 we reset all setting to default.
      
      This patch introduces a new /proc file
      /proc/sys/net/core/devconf_inherit_init_net to control the
      behavior of whether to inhert sysctl current settings from init_net.
      This file itself is only available in init_net.
      
      As demonstrated below:
      
      Initial setup in init_net:
       # cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # cat /proc/sys/net/ipv6/conf/all/accept_dad
       1
      
      Default value 0 (current behavior):
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       0
      
      Set to 1 (inherit from init_net):
       # echo 1 > /proc/sys/net/core/devconf_inherit_init_net
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       1
      
      Set to 2 (reset to default):
       # echo 2 > /proc/sys/net/core/devconf_inherit_init_net
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       0
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       0
      
      Set to a value out of range (invalid):
       # echo 3 > /proc/sys/net/core/devconf_inherit_init_net
       -bash: echo: write error: Invalid argument
       # echo -1 > /proc/sys/net/core/devconf_inherit_init_net
       -bash: echo: write error: Invalid argument
      Reported-by: NZhu Yanjun <Yanjun.Zhu@windriver.com>
      Reported-by: NTonghao Zhang <xiangxia.m.yue@gmail.com>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: NTonghao Zhang <xiangxia.m.yue@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      856c395c
  21. 20 1月, 2019 1 次提交
  22. 05 1月, 2019 1 次提交
    • A
      netlink: fixup regression in RTM_GETADDR · 7c1e8a38
      Arthur Gautier 提交于
      This commit fixes a regression in AF_INET/RTM_GETADDR and
      AF_INET6/RTM_GETADDR.
      
      Before this commit, the kernel would stop dumping addresses once the first
      skb was full and end the stream with NLMSG_DONE(-EMSGSIZE). The error
      shouldn't be sent back to netlink_dump so the callback is kept alive. The
      userspace is expected to call back with a new empty skb.
      
      Changes from V1:
       - The error is not handled in netlink_dump anymore but rather in
         inet_dump_ifaddr and inet6_dump_addr directly as suggested by
         David Ahern.
      
      Fixes: d7e38611 ("net/ipv4: Put target net when address dump fails due to bad attributes")
      Fixes: 242afaa6 ("net/ipv6: Put target net when address dump fails due to bad attributes")
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: "David S . Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: NArthur Gautier <baloo@gandi.net>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c1e8a38
  23. 15 12月, 2018 1 次提交
    • D
      net: Allow class-e address assignment via ifconfig ioctl · 65cab850
      Dave Taht 提交于
      While most distributions long ago switched to the iproute2 suite
      of utilities, which allow class-e (240.0.0.0/4) address assignment,
      distributions relying on busybox, toybox and other forms of
      ifconfig cannot assign class-e addresses without this kernel patch.
      
      While CIDR has been obsolete for 2 decades, and a survey of all the
      open source code in the world shows the IN_whatever macros are also
      obsolete... rather than obsolete CIDR from this ioctl entirely, this
      patch merely enables class-e assignment, sanely.
      Signed-off-by: NDave Taht <dave.taht@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      65cab850
  24. 07 12月, 2018 1 次提交
  25. 26 10月, 2018 1 次提交
  26. 25 10月, 2018 1 次提交
  27. 23 10月, 2018 2 次提交
  28. 09 10月, 2018 3 次提交
  29. 06 9月, 2018 2 次提交
    • C
      ipv4: add inet_fill_args · 978a46fa
      Christian Brauner 提交于
      inet_fill_ifaddr() already took 6 arguments which meant the 7th argument
      would need to be pushed onto the stack on x86.
      Add a new struct inet_fill_args which holds common information passed
      to inet_fill_ifaddr() and shortens the function to three pointer arguments.
      Signed-off-by: NChristian Brauner <christian@brauner.io>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      978a46fa
    • C
      ipv4: enable IFA_TARGET_NETNSID for RTM_GETADDR · d3807145
      Christian Brauner 提交于
      - Backwards Compatibility:
        If userspace wants to determine whether ipv4 RTM_GETADDR requests
        support the new IFA_TARGET_NETNSID property it should verify that the
        reply includes the IFA_TARGET_NETNSID property. If it does not
        userspace should assume that IFA_TARGET_NETNSID is not supported for
        ipv4 RTM_GETADDR requests on this kernel.
      - From what I gather from current userspace tools that make use of
        RTM_GETADDR requests some of them pass down struct ifinfomsg when they
        should actually pass down struct ifaddrmsg. To not break existing
        tools that pass down the wrong struct we will do the same as for
        RTM_GETLINK | NLM_F_DUMP requests and not error out when the
        nlmsg_parse() fails.
      
      - Security:
        Callers must have CAP_NET_ADMIN in the owning user namespace of the
        target network namespace.
      Signed-off-by: NChristian Brauner <christian@brauner.io>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d3807145
  30. 30 7月, 2018 1 次提交
    • X
      route: add support for directed broadcast forwarding · 5cbf777c
      Xin Long 提交于
      This patch implements the feature described in rfc1812#section-5.3.5.2
      and rfc2644. It allows the router to forward directed broadcast when
      sysctl bc_forwarding is enabled.
      
      Note that this feature could be done by iptables -j TEE, but it would
      cause some problems:
        - target TEE's gateway param has to be set with a specific address,
          and it's not flexible especially when the route wants forward all
          directed broadcasts.
        - this duplicates the directed broadcasts so this may cause side
          effects to applications.
      
      Besides, to keep consistent with other os router like BSD, it's also
      necessary to implement it in the route rx path.
      
      Note that route cache needs to be flushed when bc_forwarding is
      changed.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5cbf777c
  31. 29 5月, 2018 1 次提交
  32. 28 3月, 2018 1 次提交
  33. 13 2月, 2018 1 次提交
    • K
      net: Convert pernet_subsys, registered from inet_init() · f84c6821
      Kirill Tkhai 提交于
      arp_net_ops just addr/removes /proc entry.
      
      devinet_ops allocates and frees duplicate of init_net tables
      and (un)registers sysctl entries.
      
      fib_net_ops allocates and frees pernet tables, creates/destroys
      netlink socket and (un)initializes /proc entries. Foreign
      pernet_operations do not touch them.
      
      ip_rt_proc_ops only modifies pernet /proc entries.
      
      xfrm_net_ops creates/destroys /proc entries, allocates/frees
      pernet statistics, hashes and tables, and (un)initializes
      sysctl files. These are not touched by foreigh pernet_operations
      
      xfrm4_net_ops allocates/frees private pernet memory, and
      configures sysctls.
      
      sysctl_route_ops creates/destroys sysctls.
      
      rt_genid_ops only initializes fields of just allocated net.
      
      ipv4_inetpeer_ops allocated/frees net private memory.
      
      igmp_net_ops just creates/destroys /proc files and socket,
      noone else interested in.
      
      tcp_sk_ops seems to be safe, because tcp_sk_init() does not
      depend on any other pernet_operations modifications. Iteration
      over hash table in inet_twsk_purge() is made under RCU lock,
      and it's safe to iterate the table this way. Removing from
      the table happen from inet_twsk_deschedule_put(), but this
      function is safe without any extern locks, as it's synchronized
      inside itself. There are many examples, it's used in different
      context. So, it's safe to leave tcp_sk_exit_batch() unlocked.
      
      tcp_net_metrics_ops is synchronized on tcp_metrics_lock and safe.
      
      udplite4_net_ops only creates/destroys pernet /proc file.
      
      icmp_sk_ops creates percpu sockets, not touched by foreign
      pernet_operations.
      
      ipmr_net_ops creates/destroys pernet fib tables, (un)registers
      fib rules and /proc files. This seem to be safe to execute
      in parallel with foreign pernet_operations.
      
      af_inet_ops just sets up default parameters of newly created net.
      
      ipv4_mib_ops creates and destroys pernet percpu statistics.
      
      raw_net_ops, tcp4_net_ops, udp4_net_ops, ping_v4_net_ops
      and ip_proc_ops only create/destroy pernet /proc files.
      
      ip4_frags_ops creates and destroys sysctl file.
      
      So, it's safe to make the pernet_operations async.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Acked-by: NAndrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f84c6821
新手
引导
客服 返回
顶部