1. 20 Jul, 2021 1 commit
    • memcg: enable accounting for IP address and routing-related objects · 6126891c
      Vasily Averin committed
      A netadmin inside a container can use 'ip a a' and 'ip r a'
      to assign a large number of ipv4/ipv6 addresses and routing entries
      and force kernel to allocate megabytes of unaccounted memory
      for long-lived per-netdevice related kernel objects:
      'struct in_ifaddr', 'struct inet6_ifaddr', 'struct fib6_node',
      'struct rt6_info', 'struct fib_rules' and ip_fib caches.
      
      These objects can be removed manually, though they usually live
      in memory until their net namespace is destroyed.
      
      It makes sense to account for them to restrict the host's memory
      consumption from inside the memcg-limited container.
      
      One such object is the 'struct fib6_node', mostly allocated in
      net/ipv6/route.c::__ip6_ins_rt() inside the lock_bh()/unlock_bh() section:
      
       write_lock_bh(&table->tb6_lock);
       err = fib6_add(&table->tb6_root, rt, info, mxc);
       write_unlock_bh(&table->tb6_lock);
      
      In this case it is not enough to simply add SLAB_ACCOUNT to corresponding
      kmem cache. The proper memory cgroup still cannot be found due to the
      incorrect 'in_interrupt()' check used in memcg_kmem_bypass().
      
      The obsolete in_interrupt() does not describe the real execution context properly.
      From include/linux/preempt.h:
      
       The following macros are deprecated and should not be used in new code:
       in_interrupt()	- We're in NMI,IRQ,SoftIRQ context or have BH disabled
      
      To verify the current execution context, the new macro should be used instead:
       in_task()	- We're in task context
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
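The difference between the two checks described in this commit can be modeled in user space. The following is a minimal sketch, not kernel code: the bit layout is a simplified assumption modeled on include/linux/preempt.h, and every `model_*` and `M_*` name is hypothetical. It illustrates why a BH-disabled region such as the write_lock_bh() section quoted above counts as "interrupt" for in_interrupt() while still being task context for in_task().

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the preempt counter layout (assumption for
 * illustration; the authoritative layout is in include/linux/preempt.h). */
#define M_SOFTIRQ_SHIFT          8
#define M_HARDIRQ_SHIFT          16
#define M_NMI_SHIFT              20

#define M_SOFTIRQ_OFFSET         (1u << M_SOFTIRQ_SHIFT)   /* serving a softirq */
#define M_SOFTIRQ_DISABLE_OFFSET (2u * M_SOFTIRQ_OFFSET)   /* BH disabled */
#define M_SOFTIRQ_MASK           (0xffu << M_SOFTIRQ_SHIFT)
#define M_HARDIRQ_MASK           (0xfu << M_HARDIRQ_SHIFT)
#define M_NMI_MASK               (0xfu << M_NMI_SHIFT)

/* Deprecated check: true in NMI, IRQ, softirq context, or with BH disabled. */
static inline bool model_in_interrupt(unsigned int count)
{
    return (count & (M_NMI_MASK | M_HARDIRQ_MASK | M_SOFTIRQ_MASK)) != 0;
}

/* Preferred check: true unless we are really serving an NMI, IRQ or softirq. */
static inline bool model_in_task(unsigned int count)
{
    return (count & (M_NMI_MASK | M_HARDIRQ_MASK | M_SOFTIRQ_OFFSET)) == 0;
}

/* Hypothetical bypass decisions built on the old and the new check. */
static inline bool model_bypass_old(unsigned int count)
{
    return model_in_interrupt(count);
}

static inline bool model_bypass_new(unsigned int count)
{
    assert((count & ~(M_NMI_MASK | M_HARDIRQ_MASK | M_SOFTIRQ_MASK | 0xffu)) == 0);
    return !model_in_task(count);
}
```

With a counter value of `M_SOFTIRQ_DISABLE_OFFSET` (BH disabled from task context), `model_bypass_old()` returns true and accounting would be skipped, while `model_bypass_new()` returns false, so the allocation can be charged to the task's memory cgroup.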
  2. 09 Jun, 2021 1 commit
  3. 11 May, 2021 1 commit
    • rtnetlink: avoid RCU read lock when holding RTNL · a100243d
      Cong Wang committed
      When we call af_ops->set_link_af() we hold an RCU read lock
      as we retrieve af_ops from the RCU protected list, but this
      is unnecessary because we already hold RTNL lock, which is
      the writer lock for protecting rtnl_af_ops, so it is safer
      than the RCU read lock. The same applies to af_ops->validate_link_af().
      
      This was not a problem until we recently began to take a mutex lock
      down the path of ->set_link_af() in __ipv6_dev_mc_dec().
      We can just drop the RCU read lock there and
      assert the RTNL lock.
      
      Reported-and-tested-by: syzbot+7d941e89dd48bcf42573@syzkaller.appspotmail.com
      Fixes: 63ed8de4 ("mld: add mc_lock for protecting per-interface mld data")
      Tested-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 09 Apr, 2021 1 commit
  5. 17 Nov, 2020 1 commit
  6. 11 Nov, 2020 1 commit
  7. 31 May, 2020 1 commit
  8. 17 May, 2020 1 commit
  9. 27 Apr, 2020 1 commit
  10. 10 Apr, 2020 1 commit
    • net: ipv4: devinet: Fix crash when add/del multicast IP with autojoin · 690cc863
      Taras Chornyi committed
      When CONFIG_IP_MULTICAST is not set and a multicast IP address is added to
      the device with the autojoin flag, or when a multicast IP address is deleted,
      the kernel will crash.
      
      Steps to reproduce:
      
      ip addr add 224.0.0.0/32 dev eth0
      ip addr del 224.0.0.0/32 dev eth0
      
      or
      
      ip addr add 224.0.0.0/32 dev eth0 autojoin
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000088
       pc : _raw_write_lock_irqsave+0x1e0/0x2ac
       lr : lock_sock_nested+0x1c/0x60
       Call trace:
        _raw_write_lock_irqsave+0x1e0/0x2ac
        lock_sock_nested+0x1c/0x60
        ip_mc_config.isra.28+0x50/0xe0
        inet_rtm_deladdr+0x1a8/0x1f0
        rtnetlink_rcv_msg+0x120/0x350
        netlink_rcv_skb+0x58/0x120
        rtnetlink_rcv+0x14/0x20
        netlink_unicast+0x1b8/0x270
        netlink_sendmsg+0x1a0/0x3b0
        ____sys_sendmsg+0x248/0x290
        ___sys_sendmsg+0x80/0xc0
        __sys_sendmsg+0x68/0xc0
        __arm64_sys_sendmsg+0x20/0x30
        el0_svc_common.constprop.2+0x88/0x150
        do_el0_svc+0x20/0x80
       el0_sync_handler+0x118/0x190
        el0_sync+0x140/0x180
      
      Fixes: 93a714d6 ("multicast: Extend ip address command to enable multicast group join/leave on")
      Signed-off-by: Taras Chornyi <taras.chornyi@plvision.eu>
      Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 13 Mar, 2020 1 commit
  12. 08 Dec, 2019 1 commit
    • inet: protect against too small mtu values. · 501a90c9
      Eric Dumazet committed
      syzbot was once again able to crash a host by setting a very small MTU
      on the loopback device.
      
      Let's make inetdev_valid_mtu() available in include/net/ip.h,
      and use it in ip_setup_cork(), so that we protect both ip_append_page()
      and __ip_append_data().
      
      Also add a READ_ONCE() when the device mtu is read.
      
      Pair this lockless read with a WRITE_ONCE() in __dev_set_mtu(),
      even if other code paths might write over this field.
      
      Add a big comment in include/linux/netdevice.h about dev->mtu
      needing READ_ONCE()/WRITE_ONCE() annotations.
      
      Hopefully we will add the missing ones in followup patches.
      
      [1]
      
      refcount_t: saturated; leaking memory.
      WARNING: CPU: 0 PID: 9464 at lib/refcount.c:22 refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 9464 Comm: syz-executor850 Not tainted 5.4.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       panic+0x2e3/0x75c kernel/panic.c:221
       __warn.cold+0x2f/0x3e kernel/panic.c:582
       report_bug+0x289/0x300 lib/bug.c:195
       fixup_bug arch/x86/kernel/traps.c:174 [inline]
       fixup_bug arch/x86/kernel/traps.c:169 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
       invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
      RIP: 0010:refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Code: 06 31 ff 89 de e8 c8 f5 e6 fd 84 db 0f 85 6f ff ff ff e8 7b f4 e6 fd 48 c7 c7 e0 71 4f 88 c6 05 56 a6 a4 06 01 e8 c7 a8 b7 fd <0f> 0b e9 50 ff ff ff e8 5c f4 e6 fd 0f b6 1d 3d a6 a4 06 31 ff 89
      RSP: 0018:ffff88809689f550 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff815e4336 RDI: ffffed1012d13e9c
      RBP: ffff88809689f560 R08: ffff88809c50a3c0 R09: fffffbfff15d31b1
      R10: fffffbfff15d31b0 R11: ffffffff8ae98d87 R12: 0000000000000001
      R13: 0000000000040100 R14: ffff888099041104 R15: ffff888218d96e40
       refcount_add include/linux/refcount.h:193 [inline]
       skb_set_owner_w+0x2b6/0x410 net/core/sock.c:1999
       sock_wmalloc+0xf1/0x120 net/core/sock.c:2096
       ip_append_page+0x7ef/0x1190 net/ipv4/ip_output.c:1383
       udp_sendpage+0x1c7/0x480 net/ipv4/udp.c:1276
       inet_sendpage+0xdb/0x150 net/ipv4/af_inet.c:821
       kernel_sendpage+0x92/0xf0 net/socket.c:3794
       sock_sendpage+0x8b/0xc0 net/socket.c:936
       pipe_to_sendpage+0x2da/0x3c0 fs/splice.c:458
       splice_from_pipe_feed fs/splice.c:512 [inline]
       __splice_from_pipe+0x3ee/0x7c0 fs/splice.c:636
       splice_from_pipe+0x108/0x170 fs/splice.c:671
       generic_splice_sendpage+0x3c/0x50 fs/splice.c:842
       do_splice_from fs/splice.c:861 [inline]
       direct_splice_actor+0x123/0x190 fs/splice.c:1035
       splice_direct_to_actor+0x3b4/0xa30 fs/splice.c:990
       do_splice_direct+0x1da/0x2a0 fs/splice.c:1078
       do_sendfile+0x597/0xd00 fs/read_write.c:1464
       __do_sys_sendfile64 fs/read_write.c:1525 [inline]
       __se_sys_sendfile64 fs/read_write.c:1511 [inline]
       __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441409
      Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffb64c4f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441409
      RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000005
      RBP: 0000000000073b8a R08: 0000000000000010 R09: 0000000000000010
      R10: 0000000000010001 R11: 0000000000000246 R12: 0000000000402180
      R13: 0000000000402210 R14: 0000000000000000 R15: 0000000000000000
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 1470ddf7 ("inet: Remove explicit write references to sk/inet in ip_append_data")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
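The guard this commit describes, rejecting a too-small MTU before any data is appended, can be sketched in plain C. This is a minimal user-space illustration under stated assumptions, not the kernel implementation: `sketch_valid_mtu()`, `sketch_setup_cork()` and `struct sketch_cork` are hypothetical stand-ins, the 68-byte floor is the minimum IPv4 MTU from RFC 791, and the choice of error code is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Smallest MTU an IPv4 link must support (RFC 791). */
#define SKETCH_MIN_IPV4_MTU 68u

/* Hypothetical stand-in for an MTU validity helper. */
static inline bool sketch_valid_mtu(unsigned int mtu)
{
    return mtu >= SKETCH_MIN_IPV4_MTU;
}

/* Hypothetical, heavily reduced stand-in for per-send cork state. */
struct sketch_cork {
    unsigned int fragsize;
};

/* Refuse to set up the cork when the device MTU is too small, so that
 * later fragment-size arithmetic never works with a bogus value.
 * Returns 0 on success or a negative errno value (assumed error code). */
static inline int sketch_setup_cork(struct sketch_cork *cork, unsigned int dev_mtu)
{
    assert(cork != NULL);
    if (!sketch_valid_mtu(dev_mtu))
        return -ENETUNREACH;
    cork->fragsize = dev_mtu;
    return 0;
}
```

Checking once at setup time means every caller that goes through the setup helper is covered, which matches the stated goal of protecting both append paths with a single test.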
  13. 02 Jul, 2019 1 commit
    • ipv4: don't set IPv6 only flags to IPv4 addresses · 2e605463
      Matteo Croce committed
      Avoid the situation where an IPv6-only flag is applied to an IPv4 address:
      
          # ip addr add 192.0.2.1/24 dev dummy0 nodad home mngtmpaddr noprefixroute
          # ip -4 addr show dev dummy0
          2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
              inet 192.0.2.1/24 scope global noprefixroute dummy0
                 valid_lft forever preferred_lft forever
      
      Or worse, by sending a malicious netlink command:
      
          # ip -4 addr show dev dummy0
          2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
              inet 192.0.2.1/24 scope global nodad optimistic dadfailed home tentative mngtmpaddr noprefixroute stable-privacy dummy0
                 valid_lft forever preferred_lft forever
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
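One way to picture the fix is as a bit mask that strips IPv6-only flags from a request before it is stored on an IPv4 address. The sketch below is an illustration only: the `SK_` names and `sketch_ipv4_flags()` are hypothetical, the numeric values are assumed to mirror the uapi `IFA_F_*` constants in `<linux/if_addr.h>`, and the set of flags treated as IPv6-only is inferred from the `ip addr` output quoted above (where only `noprefixroute` is expected to remain on an IPv4 address).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the uapi IFA_F_* values; SK_ prefix is hypothetical. */
#define SK_IFA_F_NODAD          0x02u
#define SK_IFA_F_OPTIMISTIC     0x04u
#define SK_IFA_F_DADFAILED      0x08u
#define SK_IFA_F_HOMEADDRESS    0x10u
#define SK_IFA_F_TENTATIVE      0x40u
#define SK_IFA_F_PERMANENT      0x80u
#define SK_IFA_F_MANAGETEMPADDR 0x100u
#define SK_IFA_F_NOPREFIXROUTE  0x200u
#define SK_IFA_F_STABLE_PRIVACY 0x800u

/* Flags that only make sense for IPv6 addresses (inferred set). */
#define SK_IPV6ONLY_FLAGS                                        \
    (SK_IFA_F_NODAD | SK_IFA_F_OPTIMISTIC | SK_IFA_F_DADFAILED | \
     SK_IFA_F_HOMEADDRESS | SK_IFA_F_TENTATIVE |                 \
     SK_IFA_F_MANAGETEMPADDR | SK_IFA_F_STABLE_PRIVACY)

/* Drop every IPv6-only bit from the flags requested for an IPv4 address. */
static inline uint32_t sketch_ipv4_flags(uint32_t requested)
{
    uint32_t result = requested & ~SK_IPV6ONLY_FLAGS;

    assert((result & SK_IPV6ONLY_FLAGS) == 0);
    return result;
}
```

Masking on the kernel side, rather than trusting the tool that builds the request, also covers the hand-crafted netlink message mentioned in the commit text.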
  14. 28 Jun, 2019 1 commit
  15. 25 Jun, 2019 2 commits
    • ipv4: fix confirm_addr_indev() when enable route_localnet · 650638a7
      Shijie Luo committed
      When arp_ignore=3, the NIC won't reply for host-scope addresses, but
      if route_localnet is enabled, we need to reply for IP addresses that
      start with 127 and have scope RT_SCOPE_HOST.
      
      Fixes: d0daebc3 ("ipv4: Add interface option to enable routing of 127.0.0.0/8")
      Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
      Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: fix inet_select_addr() when enable route_localnet · d8c444d5
      Shijie Luo committed
      Suppose we have two interfaces, eth0 and eth1, on each of two hosts, and
      follow the same steps on both hosts:
       # sysctl -w net.ipv4.conf.eth1.route_localnet=1
       # sysctl -w net.ipv4.conf.eth1.arp_announce=2
       # ip route del 127.0.0.0/8 dev lo table local
      and then set an IP address on eth1 of host1:
       # ifconfig eth1 127.25.3.4/24
      set an IP address on eth1 of host2 and ping host1:
       # ifconfig eth1 127.25.3.14/24
       # ping -I eth1 127.25.3.4
      The result is that host2 cannot connect to host1.
      
      When an IP address starting with 127 is set, the scope of the address
      defaults to RT_SCOPE_HOST. In this situation, host2 (whose address is
      127.25.3.14) uses arp_solicit() to send an ARP request for the MAC
      address of host1. When arp_announce=2, inet_select_addr() cannot
      select a correct saddr under the condition ifa->ifa_scope > scope, because
      ifa_scope is RT_SCOPE_HOST and scope is RT_SCOPE_LINK. inet_select_addr()
      then jumps to no_in_dev, looks through all interfaces for a primary IP,
      and finally picks the primary IP of eth0.
      
      Here I add a localnet_scope that defaults to RT_SCOPE_HOST; when
      route_localnet is enabled, this value changes to RT_SCOPE_LINK so that
      inet_select_addr() finds a correct primary IP as the saddr of the ARP request.
      
      Fixes: d0daebc3 ("ipv4: Add interface option to enable routing of 127.0.0.0/8")
      Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
      Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
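The scope comparison described in this commit can be sketched as a small predicate. This is a hedged user-space illustration of the idea, not the kernel code: `sketch_addr_usable()` and the `SK_SCOPE_*` names are hypothetical, and the numeric scope values are assumed to follow the usual rtnetlink convention in which a larger number means a narrower scope.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed scope values: larger means narrower (hypothetical SK_ names). */
enum {
    SK_SCOPE_UNIVERSE = 0,
    SK_SCOPE_LINK     = 253,
    SK_SCOPE_HOST     = 254
};

/* Decide whether an address with scope ifa_scope may be chosen as the
 * source address for traffic that needs at least wanted_scope.
 * With route_localnet enabled, host-scope addresses are treated as if
 * they were link-scope, which is the effect the commit describes. */
static inline bool sketch_addr_usable(int ifa_scope, int wanted_scope,
                                      bool route_localnet)
{
    int localnet_scope = route_localnet ? SK_SCOPE_LINK : SK_SCOPE_HOST;
    int effective = ifa_scope < localnet_scope ? ifa_scope : localnet_scope;

    assert(wanted_scope >= 0 && wanted_scope <= 255);
    return effective <= wanted_scope;
}
```

For the scenario in the commit message, a 127.25.3.14 address has host scope and the ARP request needs link scope: the predicate rejects the address while route_localnet is off and accepts it once route_localnet is on.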
  16. 18 Jun, 2019 1 commit
    • net: ipv4: remove erroneous advancement of list pointer · 40008e92
      Florian Westphal committed
      This causes a crash when the lifetime of an address expires, as garbage is
      dereferenced soon after.
      
      This used to look like this:
      
       for (ifap = &ifa->ifa_dev->ifa_list;
            *ifap != NULL; ifap = &(*ifap)->ifa_next) {
                if (*ifap == ifa) ...
      
      but this was changed to:
      
      struct in_ifaddr *tmp;
      
      ifap = &ifa->ifa_dev->ifa_list;
      tmp = rtnl_dereference(*ifap);
      while (tmp) {
         tmp = rtnl_dereference(tmp->ifa_next); // Bogus
         if (rtnl_dereference(*ifap) == ifa) {
           ...
         ifap = &tmp->ifa_next;		// Can be NULL
         tmp = rtnl_dereference(*ifap);	// Dereference
         }
      }
      
      Remove the bogus assignment/list entry skip.
      
      Fixes: 2638eb8b ("net: ipv4: provide __rcu annotation for ifa_list")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
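The original loop shape quoted in this commit, walking a pointer-to-pointer and advancing only through the `next` field of the current entry, can be shown as a self-contained user-space sketch. It deliberately leaves out the RCU and RTNL annotations of the kernel code; `struct node` and `sketch_unlink()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal singly linked list node (hypothetical type). */
struct node {
    int id;
    struct node *next;
};

/* Unlink target from the list starting at *head.
 * pp always points at the link that leads to the current entry, so the
 * loop never skips an entry and never dereferences a NULL successor.
 * Returns 1 if target was found and unlinked, 0 otherwise. */
static inline int sketch_unlink(struct node **head, struct node *target)
{
    struct node **pp;

    assert(head != NULL);
    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == target) {
            *pp = target->next;   /* splice the entry out */
            target->next = NULL;
            return 1;
        }
    }
    return 0;
}
```

Because the cursor is advanced in exactly one place, removing the last entry of the list is safe: the loop condition sees the NULL link before anything tries to follow it.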
  17. 05 Jun, 2019 1 commit
  18. 03 Jun, 2019 3 commits
  19. 31 May, 2019 1 commit
  20. 28 Apr, 2019 1 commit
    • netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg committed
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would, going
      forward, prevent forgetting attribute entries. The same can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
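The split into independent strictness options can be pictured as a bit mask consulted by the parser. The sketch below is a toy model under stated assumptions, not the netlink API: all `SK_VALIDATE_*` names and `sketch_attr_ok()` are hypothetical, and only the MAXTYPE and STRICT_ATTRS options are modeled (trailing-data and NLA_UNSPEC handling are left out).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names modeled on the four options listed above. */
enum sketch_validate {
    SK_VALIDATE_TRAILING     = 1 << 0,
    SK_VALIDATE_MAXTYPE      = 1 << 1,
    SK_VALIDATE_UNSPEC       = 1 << 2,
    SK_VALIDATE_STRICT_ATTRS = 1 << 3
};

/* Combinations corresponding to the levels described in the commit. */
#define SK_VALIDATE_LIBERAL 0
#define SK_VALIDATE_DEPRECATED_STRICT \
    (SK_VALIDATE_TRAILING | SK_VALIDATE_MAXTYPE)
#define SK_VALIDATE_STRICT                                            \
    (SK_VALIDATE_TRAILING | SK_VALIDATE_MAXTYPE | SK_VALIDATE_UNSPEC | \
     SK_VALIDATE_STRICT_ATTRS)

/* Check one attribute against a policy entry.
 * - type > maxtype is tolerated unless MAXTYPE is set;
 * - a payload shorter than expected is always rejected;
 * - a payload longer than expected is rejected only with STRICT_ATTRS. */
static inline bool sketch_attr_ok(unsigned int flags, int type, int maxtype,
                                  int len, int expected_len)
{
    assert(maxtype >= 0 && expected_len >= 0);
    if (type > maxtype)
        return !(flags & SK_VALIDATE_MAXTYPE);
    if (len < expected_len)
        return false;
    if (len > expected_len && (flags & SK_VALIDATE_STRICT_ATTRS))
        return false;
    return true;
}
```

Keeping each rule behind its own bit is what lets old callers stay on a permissive combination while new callers opt into everything.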
  21. 05 Mar, 2019 1 commit
  22. 23 Jan, 2019 1 commit
    • net: introduce a knob to control whether to inherit devconf config · 856c395c
      Cong Wang committed
      There have been many people complaining about the inconsistent
      behaviors of IPv4 and IPv6 devconf when creating new network
      namespaces.  Currently, for IPv4, we inherit all current settings
      from init_net, but for IPv6 we reset all settings to default.
      
      This patch introduces a new /proc file
      /proc/sys/net/core/devconf_inherit_init_net to control the
      whether to inherit the current sysctl settings from init_net.
      This file itself is only available in init_net.
      
      As demonstrated below:
      
      Initial setup in init_net:
       # cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # cat /proc/sys/net/ipv6/conf/all/accept_dad
       1
      
      Default value 0 (current behavior):
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       0
      
      Set to 1 (inherit from init_net):
       # echo 1 > /proc/sys/net/core/devconf_inherit_init_net
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       2
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       1
      
      Set to 2 (reset to default):
       # echo 2 > /proc/sys/net/core/devconf_inherit_init_net
       # ip netns del test
       # ip netns add test
       # ip netns exec test cat /proc/sys/net/ipv4/conf/all/rp_filter
       0
       # ip netns exec test cat /proc/sys/net/ipv6/conf/all/accept_dad
       0
      
      Set to a value out of range (invalid):
       # echo 3 > /proc/sys/net/core/devconf_inherit_init_net
       -bash: echo: write error: Invalid argument
       # echo -1 > /proc/sys/net/core/devconf_inherit_init_net
       -bash: echo: write error: Invalid argument
      Reported-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
      Reported-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
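The three knob values demonstrated above reduce to a small decision table. The following sketch only restates that table in C; `sketch_devconf_source()` and the `SK_FROM_*` names are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Where a new namespace takes its devconf values from (hypothetical). */
enum sketch_source {
    SK_FROM_INIT_NET,   /* copy the current settings of init_net */
    SK_FROM_DEFAULTS    /* reset to the built-in defaults */
};

/* Map the knob value (0, 1 or 2) and the address family to a source:
 *   0: legacy behavior, IPv4 inherits and IPv6 resets;
 *   1: both families inherit from init_net;
 *   2: both families reset to defaults. */
static inline enum sketch_source sketch_devconf_source(int knob, bool is_ipv6)
{
    assert(knob >= 0 && knob <= 2);
    switch (knob) {
    case 1:
        return SK_FROM_INIT_NET;
    case 2:
        return SK_FROM_DEFAULTS;
    default:
        return is_ipv6 ? SK_FROM_DEFAULTS : SK_FROM_INIT_NET;
    }
}
```

This matches the transcript: with the default value 0 the new namespace shows rp_filter 2 (inherited) but accept_dad 0 (reset), with 1 both values are inherited, and with 2 both are reset.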
  23. 20 Jan, 2019 1 commit
  24. 05 Jan, 2019 1 commit
    • netlink: fixup regression in RTM_GETADDR · 7c1e8a38
      Arthur Gautier committed
      This commit fixes a regression in AF_INET/RTM_GETADDR and
      AF_INET6/RTM_GETADDR.
      
      Before this commit, the kernel would stop dumping addresses once the first
      skb was full and end the stream with NLMSG_DONE(-EMSGSIZE). The error
      shouldn't be sent back to netlink_dump so the callback is kept alive.
      Userspace is expected to call back with a new empty skb.
      
      Changes from V1:
       - The error is not handled in netlink_dump anymore but rather in
         inet_dump_ifaddr and inet6_dump_addr directly as suggested by
         David Ahern.
      
      Fixes: d7e38611 ("net/ipv4: Put target net when address dump fails due to bad attributes")
      Fixes: 242afaa6 ("net/ipv6: Put target net when address dump fails due to bad attributes")
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: "David S . Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Arthur Gautier <baloo@gandi.net>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 15 Dec, 2018 1 commit
    • net: Allow class-e address assignment via ifconfig ioctl · 65cab850
      Dave Taht committed
      of utilities, which allow class-e (240.0.0.0/4) address assignment,
      distributions relying on busybox, toybox and other forms of
      ifconfig cannot assign class-e addresses without this kernel patch.
      
      While classful addressing has been obsolete for 2 decades, and a survey of
      all the open source code in the world shows the IN_whatever macros are also
      obsolete... rather than remove classful handling from this ioctl entirely,
      this patch merely enables class-e assignment, sanely.
      Signed-off-by: Dave Taht <dave.taht@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 07 Dec, 2018 1 commit
  27. 26 Oct, 2018 1 commit
  28. 25 Oct, 2018 1 commit
  29. 23 Oct, 2018 2 commits
  30. 09 Oct, 2018 3 commits
  31. 06 Sep, 2018 2 commits
    • ipv4: add inet_fill_args · 978a46fa
      Christian Brauner committed
      inet_fill_ifaddr() already took 6 arguments which meant the 7th argument
      would need to be pushed onto the stack on x86.
      Add a new struct inet_fill_args which holds common information passed
      to inet_fill_ifaddr() and shortens the function to three pointer arguments.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: enable IFA_TARGET_NETNSID for RTM_GETADDR · d3807145
      Christian Brauner committed
      - Backwards Compatibility:
        If userspace wants to determine whether ipv4 RTM_GETADDR requests
        support the new IFA_TARGET_NETNSID property it should verify that the
        reply includes the IFA_TARGET_NETNSID property. If it does not,
        userspace should assume that IFA_TARGET_NETNSID is not supported for
        ipv4 RTM_GETADDR requests on this kernel.
      - From what I gather from current userspace tools that make use of
        RTM_GETADDR requests some of them pass down struct ifinfomsg when they
        should actually pass down struct ifaddrmsg. To not break existing
        tools that pass down the wrong struct we will do the same as for
        RTM_GETLINK | NLM_F_DUMP requests and not error out when the
        nlmsg_parse() fails.
      
      - Security:
        Callers must have CAP_NET_ADMIN in the owning user namespace of the
        target network namespace.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  32. 30 Jul, 2018 1 commit
    • route: add support for directed broadcast forwarding · 5cbf777c
      Xin Long committed
      This patch implements the feature described in rfc1812#section-5.3.5.2
      and rfc2644. It allows the router to forward directed broadcast when
      sysctl bc_forwarding is enabled.
      
      Note that this feature could be done by iptables -j TEE, but it would
      cause some problems:
        - target TEE's gateway param has to be set with a specific address,
          and it's not flexible, especially when the router wants to forward all
          directed broadcasts.
        - this duplicates the directed broadcasts so this may cause side
          effects to applications.
      
      Besides, to stay consistent with other OS routers such as BSD, it's also
      necessary to implement it in the route rx path.
      
      Note that route cache needs to be flushed when bc_forwarding is
      changed.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
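A directed broadcast is a packet addressed to the broadcast address of a specific subnet. As a rough illustration of the decision this commit describes, the sketch below computes that address and gates forwarding on a boolean standing in for the bc_forwarding sysctl. It is a simplified assumption, not the kernel routing path: addresses are plain host-byte-order integers, and `sketch_directed_bcast()` and `sketch_should_forward()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Broadcast address of the subnet containing addr with the given prefix
 * length: all host bits set to one (host byte order). */
static inline uint32_t sketch_directed_bcast(uint32_t addr, unsigned int prefix_len)
{
    uint32_t mask;

    assert(prefix_len <= 32);
    mask = prefix_len == 0 ? 0u : 0xffffffffu << (32 - prefix_len);
    return addr | ~mask;
}

/* Forward only when the knob is enabled and the destination is the
 * broadcast address of the attached subnet net_addr/prefix_len. */
static inline bool sketch_should_forward(uint32_t dst, uint32_t net_addr,
                                         unsigned int prefix_len,
                                         bool bc_forwarding)
{
    return bc_forwarding && dst == sketch_directed_bcast(net_addr, prefix_len);
}
```

For example, for the subnet 192.0.2.0/24 (0xC0000200) the directed broadcast address is 192.0.2.255 (0xC00002FF); with the knob off, the same packet is not forwarded.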
  33. 29 May, 2018 1 commit